Robust Control Synthesis in the Time Domain
Abstract
This thesis investigates the synthesis of controllers for time varying systems in order to satisfy an induced 2-norm closed loop performance bound. This performance criterion is a generalisation of the well known $H_\infty$ norm criterion used in the frequency domain for analysis and synthesis of linear time invariant control systems. A number of different time varying system frameworks are considered, for which there are no frequency domain counterparts. One such class is that of aperiodic sampled-data systems, that is, continuous time systems connected to a discrete controller via sampling and hold devices. Multiple generalized sampling and hold devices, which may be aperiodic and asynchronous, are permitted within the framework considered in this thesis. Using game theory, necessary and sufficient conditions are given for the existence of controllers satisfying a prespecified performance bound for such multi-rate sampled-data systems, and expressions for such a controller are given if one exists. In the state feedback case this result is generalised to include time varying systems with possible discontinuities in the state vector, of which sampled-data systems are a special case. Another inherently time domain problem investigated in this work is the moving horizon $H_\infty$ controller. The moving horizon problem was originally formulated as a method of stabilising time varying systems without requiring information about the system matrices over all future time, by minimizing a quadratic cost function up to some finite time ahead at each instant. In this thesis the method is generalized so that a differential game with quadratic cost function is solved up to a finite time ahead. Explicit constructions are given in both the state and output feedback case corresponding to this min-max problem, based upon state estimation using information from a finite time in the past.
It is shown that with the moving horizon controller it is possible to synthesize controllers which are not only stabilising but also satisfy some prespecified closed loop induced norm bound over all future time. The moving horizon problem was originally formulated as a purely continuous time problem. In this thesis it is generalized also to the case when the controller is updated only on discrete, possibly overlapping, intervals. It is shown that, if the terminal weights satisfy certain conditions, then it is possible to synthesize controllers which are $\gamma$-feasible by solving separate dynamic games on each finite interval.

Keywords: sampled-data systems, robust control, time varying systems, moving horizon control.

PREFACE

Many people have helped me over the past three years to produce this thesis and it is a great pleasure for me to take this opportunity to express my gratitude to them all. First and foremost, I would like to thank my supervisor, Professor Keith Glover, for providing me with the benefit of his insight and for giving his time generously to discuss the ideas that have made up this research. I am also grateful to the members of CUED control group for making the working environment a stimulating and enjoyable one. In particular I would like to thank Geir Dullerud for reading preliminary versions of the manuscript, and for providing the original motivation for Chapter 4.

During this research I spent two months at the Institutionen för Reglerteknik, Tekniska Högskolan i Lund, and I am especially grateful to Anders Rantzer for making this opportunity available to me, and to all the members of the control group in Lund for making the experience memorable and enjoyable. I would also like to thank Professor Haroon Ahmed for allowing me the freedom to choose my research direction, and for his guidance, support and careful advice. Brian Wootton has provided a fittingly robust computer system without which I could not have written this thesis.
Sharon Heise, Geir Dullerud, Roger Bacon, Haig Utidjian, Marc Jacobs, and Mark Lewis have all made life during the past three years in Cambridge especially worth living; on the beach, on the road, at the Tap and Spill and in the basement of Number 49. My time in Cambridge would not have been the same without them. Finally, I would like to express my heartfelt thanks to my parents and my brother Dev for their continuing encouragement and support. This research has been supported financially by the Science and Engineering Research Council of the United Kingdom, and by Cambridge University Engineering Department.

As required by University statute, I hereby declare that this dissertation is not substantially the same as any that I have submitted for a degree at any other university, is the result of my own work, and includes nothing which is the outcome of work done in collaboration.

Sanjay Lall
Cambridge
16th June 1995

CONTENTS

1 Introduction
  1.1 Organisation of the thesis
2 Preliminaries
  2.1 Signals and Systems
  2.2 Linear Operators
  2.3 Stability of LTV systems
3 State Space $H_\infty$ Theory
  3.1 $H_\infty$ and differential games
  3.2 Zero-sum games
  3.3 Discrete time dynamic games with full state information
    3.3.1 Nonlinear systems
  3.4 Continuous time dynamic games with full state information
    3.4.1 Nonlinear continuous time dynamic games
    3.4.2 The linear-quadratic game
    3.4.3 Completion of squares
  3.5 Continuous time games with imperfect measurement
    3.5.1 Nonlinear systems
    3.5.2 The linear case
  3.6 Discrete time games with imperfect measurement
    3.6.1 The information state approach
    3.6.2 The linear case
  3.7 Summary
4 Multi-rate Sampled-data $H_\infty$ Control
  4.1 Motivation
  4.2 State feedback
    4.2.1 Systems with jumps
    4.2.2 Single hold
    4.2.3 Necessary conditions for state feedback control
  4.3 Single-rate sampled-data control
  4.4 Multi-rate controller synthesis
    4.4.1 State feedback with a multi-rate hold: A simple example
    4.4.2 The general multi-rate case for jump systems
    4.4.3 The general multi-rate case for sampled-data systems
  4.5 One step delayed output feedback
    4.5.1 Problem formulation
    4.5.2 An expression for the information state
    4.5.3 Recoupling
    4.5.4 Necessity
  4.6 Summary
5 Riccati Differential Inequalities
  5.1 Preliminaries
  5.2 Inequalities via DGKF separation
    5.2.1 The bounded real lemma
  5.3 Connections to differential games
    5.3.1 State feedback
    5.3.2 Output feedback
6 Moving Horizon $H_\infty$ Control
  6.1 Motivation
  6.2 State feedback
    6.2.1 The moving horizon differential game
    6.2.2 Stability of the receding horizon controller
    6.2.3 Infinite horizon norm bounds
  6.3 Measurement feedback
  6.4 Terminal constraints
  6.5 Recursive computation of $X_\infty$
  6.6 Robust Performance and LMIs
  6.7 Summary
7 Discrete Interval Moving Horizon Control
  7.1 Introduction
  7.2 State feedback
  7.3 Conclusions
8 Conclusions
  8.1 Suggestions for future research
Bibliography

NOTATION

For the purpose of reference, the following glossary outlines the notation used in the text.

$\mathbb{Z}$    the integers
$\mathbb{Z}_+$    the non-negative integers
$\mathbb{R}$, $\mathbb{R}_+$    the real and non-negative real numbers
$\mathbb{R}^{p \times q}$    the real matrices with $p$ rows and $q$ columns
$L_\infty^{p \times q}$    the set of matrix functions $M : \mathbb{R}_+ \to \mathbb{R}^{p \times q}$ such that there exists $\beta > 0$ such that $\bar\sigma(M(t)) < \beta$ for all $t \in \mathbb{R}_+$
LTI    linear time invariant
LTV    linear time varying
RDE    Riccati differential equation
RDI    Riccati differential inequality
LMI    linear matrix inequality
$:=$    right hand side defines the left hand side
$L_2^n[t_0, t_1]$    the set of square integrable functions mapping $[t_0, t_1]$ to $\mathbb{R}^n$
$L_2^n$    the set of square integrable functions mapping $[0, \infty)$ to $\mathbb{R}^n$
$\ell_2^n[k_0, k_1]$    sequences mapping $\{ i \in \mathbb{Z} : k_0 \le i \le k_1 \}$ to $\mathbb{R}^n$
$\ell_2^n$    square summable sequences mapping $\mathbb{Z}_+$ to $\mathbb{R}^n$
$|\cdot|$    Euclidean norm on $\mathbb{R}^n$
$\|\cdot\|_2$    norm on $L_2$ or $\ell_2$
$A'$    transpose of real matrix $A$
$G^*$    adjoint of operator $G$
$\bar\sigma(A)$    maximum singular value of matrix $A$
$D_{12}^\perp$    the extension of $D_{12}$ such that, if $D_{12}(t)' D_{12}(t) = I$ for all $t$, then $\begin{bmatrix} D_{12}(t) & D_{12}^\perp(t) \end{bmatrix}$ is orthogonal for all $t$
$D_{21}^\perp$    the extension of $D_{21}$ such that, if $D_{21}(t) D_{21}(t)' = I$ for all $t$, then $\begin{bmatrix} D_{21}(t)' & D_{21}^\perp(t)' \end{bmatrix}$ is orthogonal for all $t$
$\Pi_{[k_0, k_1]}$    projection operator mapping $\ell_2[j_0, j_1]$ to $\ell_2[k_0, k_1]$, where $[k_0, k_1] \subseteq [j_0, j_1]$, such that $(\Pi_{[k_0, k_1]} u)(k) = u(k)$ for $k \in [k_0, k_1]$
$u_{[k_0, k_1]}$    $\Pi_{[k_0, k_1]} u$
$S$    sampling operator
$H$    hold operator

1. INTRODUCTION

This thesis is concerned with the synthesis of feedback controllers for uncertain systems. Given a mathematical model of a physical system, we would like a control methodology to provide answers to two basic questions: Can the specified performance be achieved? If so, which designs achieve it? This task is further complicated by the fact that we only have an approximate mathematical model of the physical system to be controlled. We would like to construct a controller that gives good performance for the physical system itself, not just for the model. In order to achieve this, we assume that the physical system is in some specified set of mathematical models. In robust control theory, this set is usually specified by a nominal model and a set of possible perturbations. The goals of feedback are to desensitize the control system to the effects of both unmodelled system dynamics and exogenous disturbances.
We would also like to stabilize the system, since the presence of uncertainty means that we cannot control its inputs exactly.

Originally formulated by Zames [81] in 1981, $H_\infty$ control theory has emerged as an extremely successful methodology for providing answers to the above design questions for complex multivariable systems. The motivation was based on frequency domain ideas, with most of the original results being for linear time invariant systems. The fundamental result of robust control theory is the small gain theorem. One measure of the gain of a linear multivariable system $G$ is given by
$$\|G\| := \sup_{u \neq 0} \frac{\|Gu\|_2}{\|u\|_2}.$$
This quantity is known as the induced norm, or operator norm, of $G$. The small gain theorem then states that, if in the feedback diagram of Figure 1.1 the product of the norms $\|G\|\,\|K\|$ is strictly less than 1, then the closed loop system will be stable.

[Figure 1.1: The small gain theorem]

In the $H_\infty$ control paradigm, the $H_\infty$-norm of a system $G$ is defined in terms of its transfer function $G$ as
$$\|G\|_\infty := \sup_{\omega} \bar\sigma(G(j\omega)).$$
That is, it is the maximum singular value of the transfer function over all frequencies. It is a well-known result that the induced $L_2$ norm is then equal to the $H_\infty$-norm.

From the small gain framework, we are led to the standard problem setup of Figure 1.2. In this case, the goal is to find a controller $K$ such that the closed loop is stable, and that the induced norm of the resulting closed loop operator from $w$ to $z$ is less than some prespecified level $\gamma$. We will refer to such controllers as $\gamma$-suboptimal or $\gamma$-feasible. We would also like to know when no such controller exists.

[Figure 1.2: The $H_\infty$ Standard Problem]

This problem corresponds in a natural way to that of disturbance rejection, by regarding $w$ as a disturbance input and $z$ as a controlled output.
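As a rough numerical illustration of the $H_\infty$-norm definition given earlier in this section, the following sketch approximates $\sup_\omega |G(j\omega)|$ on a frequency grid for a scalar system, for which the maximum singular value reduces to the magnitude of the transfer function. The second-order transfer function, the damping ratio `zeta`, the grid limits and the helper name `hinf_norm_siso` are illustrative assumptions and do not come from the thesis; a grid search of this kind only gives a lower bound on the true supremum.

```python
import math


def hinf_norm_siso(G, w_min=1e-3, w_max=1e3, n=20001):
    """Approximate sup over w of |G(jw)| on a logarithmic frequency grid."""
    best = 0.0
    log_min, log_max = math.log10(w_min), math.log10(w_max)
    for i in range(n):
        w = 10.0 ** (log_min + (log_max - log_min) * i / (n - 1))
        best = max(best, abs(G(1j * w)))
    return best


# Hypothetical lightly damped second-order system (not from the thesis).
zeta, wn = 0.1, 1.0


def G(s):
    return wn**2 / (s**2 + 2 * zeta * wn * s + wn**2)


estimate = hinf_norm_siso(G)

# Known resonant peak of this transfer function, valid for zeta < 1/sqrt(2).
exact = 1.0 / (2 * zeta * math.sqrt(1 - zeta**2))

print(estimate, exact)
```

For this example the peak gain is close to 5.03, so by the small gain theorem stated above any feedback element $K$ with induced norm below roughly 0.199 gives a norm product strictly less than 1.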
Further, the small gain theorem immediately implies that with any system perturbation $\Delta$ connected between $z$ and $w$, if the norm $\|\Delta\| < \gamma^{-1}$, then the closed loop system will remain stable. This property is known as robust stability. There are many further refinements of this idea, to include structured system perturbations, and to include robust performance as well as robust stability. Reviews of this material can be found in the books by Dahleh [21] and Green and Limebeer [29].

The first approaches to the solution of this problem were made using frequency domain techniques. However, a major development in $H_\infty$ control theory was the introduction of state space methods. Simple and intuitive solutions to the $H_\infty$ synthesis problem were presented by Glover and Doyle [28] using these techniques. This approach is particularly natural and appealing for the problem of $H_\infty$ control of linear time varying systems, and of sampled-data systems. Although originally formulated in the frequency domain, the induced norm is still well-defined for linear time varying systems, although the term "$H_\infty$-norm" has become a commonly used misnomer for the induced norm of such systems.

After the state space solutions of Glover and Doyle [28] and Doyle, Glover, Khargonekar and Francis [22] were derived, connections were made to a number of other research areas, in particular that of risk sensitive control as developed by Whittle [78]. Further connections were made to dynamic game theory by Başar and Bernhard [7] and Limebeer et al. [50]. In both these areas, solutions are developed in the time domain, using a separation principle to allow treatment of control and filtering problems separately. James [38] further develops this principle to give exact conditions for when the separation principle holds.

Dynamic games

The dynamic game formulation of $H_\infty$ control is centred around the following simple observation.
For the closed loop system with the controller $K$ in place, the induced norm $\|T_{zw}\| \le \gamma$ if and only if
$$\|z\|_2^2 - \gamma^2 \|w\|_2^2 \le 0 \quad \text{for all } w \in L_2,$$
which is if and only if
$$\sup_{w \in L_2} \left( \|z\|_2^2 - \gamma^2 \|w\|_2^2 \right) \le 0.$$
We might therefore regard this as a suitable cost function for a controller synthesis procedure to try and minimize. If the resulting minimum is negative or zero, then the controller will achieve the induced norm bound given by $\gamma$. Further, if the resulting minimum is greater than zero, we know that there exists no controller achieving the desired induced norm bound. We therefore consider the problem
$$\inf_{K} \sup_{w \in L_2} \left( \|z\|_2^2 - \gamma^2 \|w\|_2^2 \right),$$
which is a dynamic game problem. We can regard the controller $K$ as trying to minimize the worst case cost function which the "malicious" disturbance $w$ can produce.

The game theoretic formulation also allows us to consider finite horizon problems, that is, problems in which we are concerned only with controlling the system for a finite time. In this case, the signal norms, and the induced norm, are defined on a finite interval $[t_0, t_1]$. In this case, a suitable cost function is of the form
$$\inf_{K} \sup_{w \in L_2} \left\{ \int_{t_0}^{t_1} \left( z(t)'z(t) - \gamma^2 w(t)'w(t) \right) dt + x(t_1)' Q_1 x(t_1) \right\}$$
with $Q_1$ a fixed positive definite matrix weight on the terminal state. Clearly, with $x_0 = 0$, for any fixed $Q_1 \ge 0$, if there exists a minimax controller $K$ then this will result in an induced norm less than or equal to $\gamma$ on the finite horizon. That is,
$$\sup_{w \in L_2[t_0, t_1]} \frac{\|\Pi_{[t_0, t_1]} z\|}{\|w\|} \le \gamma.$$

One particular application of finite horizon synthesis techniques has been in the area of moving horizon control. In this methodology, at each time $t$ the controller is designed for the finite interval $[t, t + T]$. The controller is then used on the possibly infinitesimal interval $[t, t + \delta]$ before being recomputed. This technique was motivated by Kleinman [44] and Kwon and Pearson [46] in the early 1970s for use with linear quadratic optimization problems. It was shown that the resulting controllers were stable when applied on an infinite horizon.
This allowed the synthesis for linear time varying systems without full future knowledge of the plant parameters.

In this thesis, we extend the above moving horizon formulation to the game theoretic induced norm formulation. We show that moving horizon control with a minimax criterion not only gives stability, but also allows synthesis of $\gamma$-suboptimal controllers on an infinite horizon without explicit knowledge of the system parameters for more than a finite time ahead. We will consider both state feedback and output feedback synthesis techniques, and both continuous and discrete update of the control law.

Sampled-data systems

Physical systems generally evolve in continuous time. However, a controller will usually be implemented using digital hardware, and hence must consist of a discrete control law. Such systems are described as hybrid, or sampled-data systems. A common approach to the synthesis problem has been to start with a continuous model of the system, construct a continuous time controller using well-known techniques, and then construct a discrete controller which approximates the continuous controller. Alternatively, the designer might begin with a discrete model for the continuous physical system. Clearly, such a design process cannot take account of the behaviour of the system between the sampling instants.

Another reason for the construction of discrete controllers for continuous systems is that some systems can only provide information at discrete time instants. For example, in order to produce a measurement of the state of a reaction in a chemical process, it might be necessary to perform tests on a small sample of the reactants. For these kinds of systems, we would like to be able to explicitly construct discrete controllers for continuous time systems.
We would like a synthesis procedure which, using a continuous time performance and robustness measure, gives us the achievable performance level using a discrete controller with a given sampling rate.

Hara and Kabamba [31], and Bamieh and Pearson [3], have given $H_\infty$ synthesis procedures for sampled-data systems using the technique of lifting. In the case when the sample and hold operators are of fixed period and synchronized with each other, it is possible to transform the sampled-data system to a discrete system, whose inputs and outputs are at each time step in infinite dimensional spaces. It is then possible to construct a discrete system such that any $\gamma$-suboptimal discrete controller for the discrete system will also be $\gamma$-suboptimal for the original sampled-data system. The technique of lifting fails, however, when the sample or hold operators are non periodic, or when they are not synchronized, or their rates not rationally related. The technique of lifting is extremely specialized to the particular periodic structure of the problem, since by a minor perturbation of the sampling and hold rates the problem becomes truly time varying and this technique no longer applies.

The direct time domain formulation of the dynamic game allows consideration of the multi-rate sampled-data problem, with non-periodic sample and hold devices. Further, we allow multiple sample and hold devices on different input and output channels, and construct finite horizon suboptimal $H_\infty$ controllers for such systems.

1.1 Organisation of the thesis

Chapter 2: Preliminaries

Chapter 2 describes the basic signal and operator spaces used throughout the text, and assembles the state space mathematical descriptions we will use for linear time varying systems. We state the definitions of exponential stability, stabilizability and detectability that we will use throughout this thesis.

Chapter 3: State space $H_\infty$ theory

In this chapter we describe the basic definitions of zero-sum static and dynamic games, and the idea of a saddle point. We show that the $H_\infty$ problem is equivalent to a dynamic game problem over certain strategy spaces. In the output feedback case, we consider the separation theories of Whittle [78] and Başar and Bernhard [7], and also the use of information state for separation as developed by James, Baras and Elliott [40]. We give a new simple proof of the separation principle for the continuous time output feedback problem, and an explicit characterization of the discrete systems for which separation can be applied.

Chapter 4: Multi-rate sampled-data $H_\infty$ control

We first consider the state feedback problem for a very general class of systems with possible jumps in the state vector. We then specialize to sampled-data systems, state the formulation of multi-rate sampled-data control we will use, and consider its solution by means of simple examples in the state feedback case. For finite horizon problems, we give necessary and sufficient conditions for existence of multi-rate sampled-data controllers given particular sample and hold operators, and if there exists such a controller we give an explicit construction technique. We show that the minimax controller requires knowledge of all future hold operations, but does not require knowledge of all future sampling operations.

Chapter 5: Riccati differential inequalities

In this chapter we present a solution of the continuous time infinite horizon linear time varying $H_\infty$ synthesis problem, using the separation theory of Doyle, Glover, Khargonekar and Francis [22]. We extend the results of Khargonekar, Ravi and Nagpal [43] to replace the Riccati differential equations with Riccati differential inequalities, and remove a particular stability assumption on the solutions to these inequalities.
We then obtain similar results using game theoretic techniques, by perturbing the past and future cost functions.

Chapter 6: Moving horizon $H_\infty$ control

We now use the results of the previous chapter to construct moving horizon $H_\infty$ controllers. We construct continuously updated moving horizon controllers in both the state and output feedback case, and show that for particular choices of initial and terminal state weights the resulting controllers are both stabilizing and $\gamma$-suboptimal. We show that this corresponds to a monotonicity condition in the linear time invariant case, and that the solutions are also solutions to the LMI formulation of $H_\infty$ control.

Chapter 7: Discrete Interval Moving Horizon Control

This chapter shows how we can extend the continuously updated moving horizon controllers to the discretely updated case. It shows that we can construct $H_\infty$ controllers piecemeal over time, by dividing the time interval up into possibly overlapping segments and implementing different controllers in each segment. The boundary conditions obtained for this problem are related to the monotonicity conditions of Chapter 6.

Chapter 8: Conclusions

This chapter summarizes the contributions of this thesis, and outlines directions for future research into $H_\infty$ control in the time domain.

2. PRELIMINARIES

2.1 Signals and Systems

We begin with a section describing the basic continuous and discrete time systems we will encounter throughout this thesis. The following material is standard and can be found, for example, in [14].

Let $\mathbb{R}^n$ be $n$-dimensional Euclidean space, with norm $|\cdot|$ defined by $|x|^2 = \sum_{i=1}^{n} |x_i|^2$, where $x \in \mathbb{R}^n$ is given by $x = (x_1, \dots, x_n)$. The continuous signal spaces we will consider will be the Hilbert spaces $L_2^n$ of Lebesgue measurable functions mapping $[0, \infty)$ to $\mathbb{R}^n$, defined by
$$L_2^n := \left\{ u : [0, \infty) \to \mathbb{R}^n \;;\; \|u\|_2^2 := \int_0^\infty |u(t)|^2 \, dt < \infty \right\}.$$
We also define the $L_2$ norm on a finite interval by
$$L_2^n[t_0, t_1] := \left\{ u : [t_0, t_1] \to \mathbb{R}^n \;;\; \|u\|_2^2 := \int_{t_0}^{t_1} |u(t)|^2 \, dt < \infty \right\}.$$
Often we will consider systems on a finite horizon. By this, we mean that the inputs and outputs will be in $L_2[t_0, t_1]$, for some interval $[t_0, t_1]$. The extended space $L_{2e}$ is defined to be the space of functions on $[0, \infty)$ such that $u \in L_{2e}$ if for every $T > 0$, the restriction of $u$ to the finite interval $[0, T]$ is a member of $L_2[0, T]$. We also have the discrete time Hilbert space
$$\ell_2^n := \left\{ u : \mathbb{Z}_+ \to \mathbb{R}^n \;;\; \|u\|_2^2 := \sum_{i=0}^{\infty} |u(i)|^2 < \infty \right\}$$
and its finite horizon counterpart
$$\ell_2^n[j, k] := \left\{ u : [j, k] \to \mathbb{R}^n \right\}, \qquad \|u\|_2^2 := \sum_{i=j}^{k} |u(i)|^2.$$

Before discussing systems with inputs and outputs, we first consider the time varying linear differential equation
$$\dot{x}(t) = A(t)x(t), \qquad x(t_0) = x_0 \tag{2.1}$$
where $x(t) \in \mathbb{R}^n$ for $t \in \mathbb{R}_+$, and $A \in L_\infty^{n \times n}$ is a bounded matrix function of time. Since $A$ is bounded, the right hand side of this ordinary differential equation is a globally Lipschitz function of $x(t)$, and hence a solution always exists for all $t \ge t_0$. Proofs of these results can be found in, for example, [61, 16]. Then define $\Phi$, the transition matrix of $A$, by
$$\frac{\partial}{\partial t} \Phi(t, s) = A(t)\Phi(t, s), \qquad \Phi(s, s) = I.$$
The solution of (2.1) is then given by
$$x(t) = \Phi(t, t_0)x(t_0).$$

Useful properties of $\Phi$. For all $s, t \in \mathbb{R}_+$,
a) $\Phi(s, s) = I$
b) $\Phi(s, t) = \Phi(s, t_1)\Phi(t_1, t)$
c) $\Phi(s, t)^{-1} = \Phi(t, s)$
d) $\frac{\partial}{\partial s} \Phi(t, s) = -\Phi(t, s)A(s)$
e) If $A$ is time invariant, then $\Phi(t, s) = e^{(t - s)A}$
f) $\det \Phi(t, s) = \exp\left( \int_s^t \operatorname{trace} A(\tau) \, d\tau \right)$
g) If $\Psi(t, s) = \Phi(s, t)'$, then $\frac{\partial}{\partial t} \Psi(t, s) = -A(t)'\Psi(t, s)$

We will describe a finite dimensional linear time varying system $G$ by a state space representation
$$\begin{aligned} \dot{x}(t) &= A(t)x(t) + B(t)u(t), \qquad x(t_0) = x_0 \\ y(t) &= C(t)x(t) + D(t)u(t) \end{aligned} \tag{2.2}$$
on either a finite horizon with $t \in [t_0, t_1]$ or an infinite horizon with $t \in \mathbb{R}_+$. Here $x(t) \in \mathbb{R}^n$ is the state of the system, and we assume $u \in L_2^m$ and $y \in L_2^p$. The system matrices satisfy $A \in L_\infty^{n \times n}$, $B \in L_\infty^{n \times m}$, $C \in L_\infty^{p \times n}$, and $D \in L_\infty^{p \times m}$. We will describe this system as strictly proper if $D \equiv 0$.
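The transition matrix properties (b), (c) and (f) listed above can be checked numerically in the time invariant case, where property (e) gives $\Phi(t, s) = e^{(t - s)A}$. The sketch below is a minimal illustration under stated assumptions: the matrix `A`, the time instants and the helper names `matmul`, `expm` and `phi` are hypothetical choices, and the matrix exponential is computed by a truncated Taylor series, which is adequate only for matrices of modest norm.

```python
import math


def matmul(X, Y):
    """Product of two matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def expm(A, terms=40):
    """Matrix exponential by truncated Taylor series (small norms only)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term now holds A^k / k!
        term = [[x / k for x in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def phi(t, s, A):
    """Transition matrix of a time invariant A: Phi(t, s) = exp((t - s) A)."""
    return expm([[(t - s) * a for a in row] for row in A])


# Hypothetical time invariant system matrix and time instants.
A = [[0.0, 1.0], [-2.0, -0.5]]
s, t1, t = 1.2, 0.7, 0.3

# Property (b): Phi(s, t) = Phi(s, t1) Phi(t1, t).
lhs = phi(s, t, A)
rhs = matmul(phi(s, t1, A), phi(t1, t, A))
composition_error = max(abs(lhs[i][j] - rhs[i][j])
                        for i in range(2) for j in range(2))

# Property (c): Phi(s, t) Phi(t, s) = I.
prod = matmul(phi(s, t, A), phi(t, s, A))
inverse_error = max(abs(prod[i][j] - float(i == j))
                    for i in range(2) for j in range(2))

# Property (f): det Phi(s, t) = exp((s - t) trace A) for constant A.
det_phi = lhs[0][0] * lhs[1][1] - lhs[0][1] * lhs[1][0]
det_error = abs(det_phi - math.exp((s - t) * (A[0][0] + A[1][1])))

print(composition_error, inverse_error, det_error)
```

For a genuinely time varying $A(t)$ the exponential formula no longer applies, and $\Phi$ would have to be obtained by integrating its defining differential equation numerically.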
In the infinite horizon case, this mapping may only be defined for particular inputs $u$ and initial states $x_0$, since it is possible for an $L_2$ input signal to give rise to an output signal $y$ which does not have a finite 2-norm. We may also write this system as
$$G = \begin{bmatrix} A & B \\ C & D \end{bmatrix}.$$
For any initial state $x_0 \in \mathbb{R}^n$, the output of the system can be expressed in terms of the variation of constants formula
$$y(t) = C(t)\Phi(t, t_0)x_0 + C(t) \int_{t_0}^{t} \Phi(t, s)B(s)u(s) \, ds + D(t)u(t).$$
This formula can be verified easily by direct differentiation.

2.2 Linear Operators

Let $G$ be a linear map from $X$ to $Y$, where $X$, $Y$ are normed spaces. Then the induced norm of $G$ is defined by
$$\|G\| := \sup_{x \neq 0} \frac{\|Gx\|}{\|x\|}.$$
If $\|G\|$ is finite then we describe $G$ as a bounded operator, and $\|G\|$ is known as the operator norm.

For a matrix $A$, the induced 2-norm is given by
$$\|A\| := \sup_{x \neq 0} \frac{\|Ax\|}{\|x\|} = \bar\sigma(A)$$
where $\bar\sigma(A)$ is the maximum singular value of $A$. $\bar\sigma(A)$ is also the square root of the largest eigenvalue of $AA^*$. For a matrix function $X : \mathbb{R}_+ \to \mathbb{R}^{p \times q}$, $X \in L_\infty^{p \times q}$, we will use the notation
$$\|X\|_\infty := \sup_{t \ge 0} \|X(t)\|$$
to indicate the maximum norm of $X$ over all time. For a symmetric positive definite matrix $X$, $\|X\| < \gamma$ if and only if $X < \gamma I$.

2.3 Stability of LTV systems

Consider the system $G$ defined by equations (2.2). Assume $x_0 = 0$, so that $y$ is a linear function of $u$. We have the following definition.

Definition 2.1. The system $G$ is said to be exponentially stable if there exist $c_1, c_2 > 0$ such that $\|\Phi(t_1, t_2)\| \le c_1 e^{-c_2 (t_1 - t_2)}$ for all $t_1 \ge t_2$, where $\Phi(t_1, t_2)$ is the transition matrix for $A(t)$.

Note that this definition depends on the homogeneous part of the system only, and we will use the phrase "$A$ is stable" also to mean that $G$ is stable. The following lemma is well known. Related results are given in [74, 1, 65].

Lemma 2.2. If $G$ is exponentially stable, and $x_0 = 0$, then the input output map $G : L_2 \to L_2$ is a bounded linear operator, and its induced norm is defined as
$$\|G\| = \sup_{\|u\| \neq 0} \frac{\|y\|_2}{\|u\|_2}.$$

In order to state a converse result, we need the notions of stabilizability and detectability. The following definitions can be found in [58, 43].

Definition 2.3. The system $G$, or the pair $(A, B)$, is said to be stabilizable if there exists a bounded matrix function $F$ such that the system
$$\dot{x}(t) = \left( A(t) - B(t)F(t) \right) x(t)$$
is exponentially stable.

Definition 2.4. The system $G$, or the pair $(A, C)$, is said to be detectable if there exists a bounded matrix function $L$ such that the system
$$\dot{x}(t) = \left( A(t) - L(t)C(t) \right) x(t)$$
is exponentially stable.

We can now state the following result.

Lemma 2.5. If $G$ is stabilizable and detectable, then it is exponentially stable if and only if $G$ is a bounded operator on $L_2$.

3. STATE SPACE $H_\infty$ THEORY

3.1 $H_\infty$ and differential games

Our goal in this chapter is to present some of the results that have been developed using time domain techniques to construct solutions to the $H_\infty$, or induced 2-norm, synthesis problem. We will concentrate on the game theoretic formulations, and will use both the nonlinear concept of information state, first used in $H_\infty$ theory by James and Baras [40], and the idea of worst case disturbances, as developed by Başar and Bernhard [7] and Whittle [78].

Throughout this thesis we will consider what has become known as the "standard" $H_\infty$ synthesis problem. That is, we are given a causal linear system $G$ mapping inputs $w, u$ to outputs $z, y$. The inputs are assumed in $L_2$ or $\ell_2$, depending on whether the problem is continuous or discrete time. We regard $z$ as the controlled output, $w$ as the disturbance input, $u$ as the controlled input and $y$ as the measured output. Then we are looking for a causal controller $K$ to connect between $y$ and $u$ so that the induced norm of the closed loop map from $w$ to $z$ is minimized. This is also known as the disturbance rejection problem.
If we partition the system $G$ as
$$\begin{bmatrix} z \\ y \end{bmatrix} = \begin{bmatrix} G_{11} & G_{12} \\ G_{21} & G_{22} \end{bmatrix} \begin{bmatrix} w \\ u \end{bmatrix}$$
and denote the closed loop map by $T_{zw}$, then $T_{zw} = G_{11} + G_{12}K(I - G_{22}K)^{-1}G_{21}$.

[Figure 3.1: The $H_\infty$ Standard Problem]

We would then like to choose $K$ such that the induced norm
$$\|T_{zw}\| := \sup_{w \in L_2,\, w \neq 0} \frac{\|T_{zw}w\|_2}{\|w\|_2}$$
is minimized. We also have the further requirement that the closed loop is well-posed and internally stable, that is, for every input signal, the signals $y$ and $u$ are well-defined.

Current approaches to this problem at the stated level of generality are via the suboptimal synthesis problem. Given a real number $\gamma > 0$, we look for a causal controller $K$ such that $\|T_{zw}\| < \gamma$, and such that the closed loop is well-posed and internally stable. We will also refer to such a controller as $\gamma$-feasible. Obviously, given such a synthesis procedure it is then possible to iterate on $\gamma$ to achieve a controller that performs arbitrarily close to optimally. We will use game theory to construct such $\gamma$-feasible controllers.

In this chapter we will first describe the background game theory, for both static and dynamic games. We will show how the $H_\infty$ problem is related to the theory of differential games, and make use of the Hamilton-Jacobi-Bellman equation to construct state feedback $\gamma$-feasible controllers. For measurement feedback problems, we give an overview of the current results on certainty equivalence, and provide a new direct derivation for the continuous time linear-quadratic case. We then solve the finite horizon time varying problem in the state and output feedback cases.

3.2 Zero-sum games

We will now describe a broad class of problems called zero-sum games. Most of the following material is standard and can be found in [8], for example.

Let $U$ and $W$ be vector spaces, and $U_1 \subset U$, $W_1 \subset W$ be subsets. Let $J : U_1 \times W_1 \to \mathbb{R}$. Define
$$\overline{J} := \inf_{u \in U_1} \sup_{w \in W_1} J(u, w), \qquad \underline{J} := \sup_{w \in W_1} \inf_{u \in U_1} J(u, w).$$
We call $\overline{J}$ the upper value of the game, and $\underline{J}$ the lower value of the game.
The function $J$ is the cost function, or kernel, of the game and $u$ and $w$ are regarded as players. The objective of player $u$ is to minimize $J$ and the objective of player $w$ is to maximize $J$, with neither player having access to the strategy of the other. Equivalently we might describe this as the players making their decisions simultaneously.

Remark 3.1. In the definition of $\overline{J}$, we allow the function $\sup_{w \in W_1} J(u, w)$ to take infinite values for some $u$, and regard $\overline{J}$ as well defined provided that this function is bounded below with respect to $u$, and finite for some $u$. A similar statement holds for $\underline{J}$.

A natural question to ask is, are these two values equal? That is, can we interchange the sup and inf operations in the above expressions? The answer, in general, is no. The following is easily seen:

Lemma 3.2. Suppose that $\overline{J}$ and $\underline{J}$ are well defined. Then
\[ \overline{J} = \inf_{u \in U_1} \sup_{w \in W_1} J(u, w) \ \geq\ \sup_{w \in W_1} \inf_{u \in U_1} J(u, w) = \underline{J}. \]

Proof.
\[ \inf_{u \in U_1} J(u, w) \leq J(u, w) \quad \forall u \in U_1,\ w \in W_1 \]
\[ \implies \sup_{w \in W_1} \inf_{u \in U_1} J(u, w) \leq \sup_{w \in W_1} J(u, w) \quad \forall u \in U_1 \]
\[ \implies \sup_{w \in W_1} \inf_{u \in U_1} J(u, w) \leq \inf_{u \in U_1} \sup_{w \in W_1} J(u, w). \]

For any given strategy of the $u$ player, the most damaging response available to the $w$ player is a strategy which achieves
\[ \sup_{w \in W_1} J(u, w), \]
assuming that the maximum is achieved. So the $u$ player would like to play a strategy which achieves the minimum in
\[ \inf_{u \in U_1} \sup_{w \in W_1} J(u, w). \]
This is a worst case strategy for the $u$ player, since no other strategy can guarantee a better outcome for player $u$ against all possible moves of player $w$. The worst case cost for the $u$ player is therefore $\overline{J}$. A similar philosophy holds for player $w$, and clearly the worst case outcome is given by $\underline{J}$ in this case. In order to analyse the situation where both players choose strategies according to this scheme, we make the following definition.

Definition 3.3 (Saddle Point). The game defined by the kernel $J$ is said to have a saddle point if there exist $\hat{u} \in U_1$, $\hat{w} \in W_1$ such that
\[ J(\hat{u}, w) \leq J(\hat{u}, \hat{w}) \leq J(u, \hat{w}) \]
for all $u \in U_1$ and $w \in W_1$.

Lemma 3.4.
Suppose that $\overline{J}$ and $\underline{J}$ are well defined, and that the game has a saddle point at $\hat{u}$, $\hat{w}$. Then
\[ \inf_{u \in U_1} \sup_{w \in W_1} J(u, w) = J(\hat{u}, \hat{w}) = \sup_{w \in W_1} \inf_{u \in U_1} J(u, w). \]

Proof. Since the game has a saddle point, there exist $\hat{u} \in U_1$, $\hat{w} \in W_1$ such that
\[ J(\hat{u}, w) \leq J(\hat{u}, \hat{w}) \leq J(u, \hat{w}) \]
for all $u \in U_1$ and $w \in W_1$. Therefore
\[ \sup_{w \in W_1} J(\hat{u}, w) = J(\hat{u}, \hat{w}) = \inf_{u \in U_1} J(u, \hat{w}). \]
Also, clearly
\[ \inf_{u \in U_1} \sup_{w \in W_1} J(u, w) \leq \sup_{w \in W_1} J(\hat{u}, w) \quad \text{and} \quad \inf_{u \in U_1} J(u, \hat{w}) \leq \sup_{w \in W_1} \inf_{u \in U_1} J(u, w). \]
Therefore
\[ \inf_{u \in U_1} \sup_{w \in W_1} J(u, w) \leq J(\hat{u}, \hat{w}) \leq \sup_{w \in W_1} \inf_{u \in U_1} J(u, w). \]
From Lemma 3.2, the result follows.

Lemma 3.5. Suppose that $\overline{J}$ and $\underline{J}$ are finite, and that $u_0 \in U_1$ achieves the infimum in
\[ \inf_{u \in U_1} \sup_{w \in W_1} J(u, w) \]
and $w_0 \in W_1$ achieves the supremum in
\[ \sup_{w \in W_1} \inf_{u \in U_1} J(u, w). \]
Then if $\overline{J} = \underline{J}$, the strategy pair $(u_0, w_0)$ is a saddle point.

Proof. Clearly
\[ J(u_0, w_0) \leq \sup_{w \in W_1} J(u_0, w) = \inf_{u \in U_1} \sup_{w \in W_1} J(u, w) \]
and
\[ J(u_0, w_0) \geq \inf_{u \in U_1} J(u, w_0) = \sup_{w \in W_1} \inf_{u \in U_1} J(u, w). \]
Hence, since $\overline{J} = \underline{J}$,
\[ \inf_{u \in U_1} \sup_{w \in W_1} J(u, w) = J(u_0, w_0) = \sup_{w \in W_1} \inf_{u \in U_1} J(u, w). \]
Then
\[ J(u_0, w_0) = \sup_{w \in W_1} \inf_{u \in U_1} J(u, w) = \inf_{u \in U_1} J(u, w_0) \leq J(u, w_0) \quad \forall u \in U_1 \]
and a similar inequality holds for $w$, giving the required result.

If $\overline{J} = \underline{J}$ then we call this the value of the game and denote it by $J^{*}$. Note that there are games which have a value but not a saddle point, that is, although $\overline{J} = \underline{J}$ there is no $u_0, w_0$ for which $J(u_0, w_0) = J^{*}$, since the required infima and suprema are not achieved. There are also games which have neither value nor saddle point. For example, the zero-sum game with cost function $J(u, w) = (u - w)^2$ defined for $u, w \in [0, 2] \subset \mathbb{R}$ has $\overline{J} = 1$ and $\underline{J} = 0$, and hence does not have a value. An example of such a game in the dynamic case is given by Stoorvogel [63, p. 123]. The following result will be useful in the sequel.

Theorem 3.6. Let
\[ u^{*}(w) = \Bigl\{ u_0 \in U_1 \ ;\ \inf_{u \in U_1} J(u, w) = J(u_0, w) \Bigr\}, \qquad w^{*}(u) = \Bigl\{ w_0 \in W_1 \ ;\ \sup_{w \in W_1} J(u, w) = J(u, w_0) \Bigr\}. \]
Then $u_0 \in u^{*}(w_0)$ and $w_0 \in w^{*}(u_0)$ if and only if $u_0$, $w_0$ is a saddle point solution.
Proof. (only if:) Suppose $u_0 \in u^{*}(w_0)$ and $w_0 \in w^{*}(u_0)$. Then
\[ \inf_{u \in U_1} J(u, w_0) = J(u_0, w_0), \qquad \sup_{w \in W_1} J(u_0, w) = J(u_0, w_0), \]
so
\[ J(u_0, w) \leq J(u_0, w_0) \leq J(u, w_0) \]
for all $u \in U_1$, $w \in W_1$. (if:) Suppose $u_0$, $w_0$ are a saddle point solution for $J$. Then $J(u_0, w_0) \leq J(u, w_0)$ for all $u \in U_1$, and so $u_0 \in u^{*}(w_0)$. Similarly for $w$, giving the required result.

Note that if, for each $w$, $u^{*}(w)$ is a single point, and for each $u$, $w^{*}(u)$ is a single point, then this result says that $(u, w)$ is a saddle point if and only if it is a fixed point of the map $(u, w) \mapsto (u^{*}(w), w^{*}(u))$.

3.3 Discrete time dynamic games with full state information

3.3.1 Nonlinear systems

Although in this thesis we will be concerned solely with linear systems and quadratic cost functions, many of the basic results we will use from the theory of dynamic games apply equally well to nonlinear systems. Because of this, we will first state the results in the more general nonlinear framework, and later apply them to linear systems. In this section, we will consider finite dimensional nonlinear time varying systems on a finite horizon described by
\[ x(k+1) = f\bigl(k, x(k), u(k), w(k)\bigr), \qquad x(0) = x_0 \tag{3.1} \]
\[ z(k) = g\bigl(k, x(k), u(k)\bigr) \tag{3.2} \]
Here the state $x(k) \in \mathbb{R}^n$, the disturbance $w \in \ell_2^{m_1}[0, K]$, and the control input $u \in \ell_2^{m_2}[0, K]$, with $0 \leq k \leq K$. Let
\[ \mathcal{U}_{sf}^{n,p}[k, K] := \bigl\{ \mu : [k, K] \times \mathbb{R}^n \to \mathbb{R}^p \bigr\} \]
be the space of all state feedback strategies on the discrete interval $[k, K]$. In the sequel we will usually omit the spatial dimensions.
We will consider a problem where both players have access to the state of the system, that is
\[ u(t) = \mu(t, x(t)), \quad \mu \in \mathcal{U}_{sf}^{n,m_2}[0, K], \qquad w(t) = \nu(t, x(t)), \quad \nu \in \mathcal{U}_{sf}^{n,m_1}[0, K]. \]
The cost function we will consider is
\[ J(x(0), \mu, \nu) = \sum_{i=0}^{K} \bigl\{ |z(i)|^2 - \gamma^2 |w(i)|^2 \bigr\}. \]
This has obvious connections to the induced 2-norm of the system, since if the cost function $J(x(0), \mu, \nu) \leq 0$ then clearly $\|z\|_2^2 \leq \gamma^2 \|w\|_2^2$, although as yet we have not shown any direct relation between the $\gamma$-suboptimal synthesis problem and the dynamic game problem. The dynamic game problem is, given $x(0)$, to find a saddle point for the problem
\[ \inf_{\mu \in \mathcal{U}_{sf}} \sup_{\nu \in \mathcal{U}_{sf}} J(x(0), \mu, \nu), \]
that is, to find $\mu_0 \in \mathcal{U}_{sf}$ and $\nu_0 \in \mathcal{U}_{sf}$ such that
\[ J(\mu_0, \nu) \leq J(\mu_0, \nu_0) \leq J(\mu, \nu_0) \quad \forall \mu, \nu \in \mathcal{U}_{sf} \tag{3.3} \]
for fixed $x(0)$. The following remark is crucial to the equivalence between this problem and the induced 2-norm synthesis problem.

Remark 3.7. If either $\mu_0$ or $\nu_0$ is a fixed feedback law in $\mathcal{U}_{sf}$, then choosing the other input in $\mathcal{U}_{sf}$ results in well-defined signals for the state and both players for any given initial state. The signals will be in $\ell_2$ if the map from the inputs to the state is a bounded linear map. So without loss of generality we can replace strategies $\mu$ and $\nu$ in equation (3.3) with signals $u$ and $w$ in $\ell_2$ for the linear case. This is expressed by the following proposition, where we abuse notation slightly to write $J$ as a function of both strategies and signals.

Proposition 3.8. The state feedback strategies $\mu_0, \nu_0 \in \mathcal{U}_{sf}$ satisfy
\[ J(\mu_0, \nu) \leq J(\mu_0, \nu_0) \leq J(\mu, \nu_0) \quad \forall \mu, \nu \in \mathcal{U}_{sf} \]
if and only if they satisfy
\[ J(\mu_0, w) \leq J(\mu_0, \nu_0) \leq J(u, \nu_0) \quad \forall u, w \in \ell_2. \]

Proof. We prove only one side of the inequality, and the other side follows similarly. That is, we show that, given $x_0 \in \mathbb{R}^n$, $\mu_0 \in \mathcal{U}_{sf}$ and $c \in \mathbb{R}$, then $J(x_0, \mu_0, \nu) \leq c$ for all $\nu \in \mathcal{U}_{sf}$ if and only if $J(x_0, \mu_0, w) \leq c$ for all $w \in \ell_2$. But this is now obvious; by connecting an arbitrary time varying state feedback around any given system we can span the entire input space to the system.
Note that it is assumed that $\mu_0$ and $\nu_0$ are members of $\mathcal{U}_{sf}$. Then we can write the upper and lower values of the game as
\[ \overline{J} = \inf_{\mu \in \mathcal{U}_{sf}} \sup_{w \in \ell_2} J(x(0), \mu, w), \qquad \underline{J} = \sup_{\nu \in \mathcal{U}_{sf}} \inf_{u \in \ell_2} J(x(0), u, \nu), \]
so the upper value of the original differential game, where both players had full state information, is equal to the upper value of the game in which only the $u$ player has full state information. Since with $w = 0$, the function $J(x(0), \mu, w) \geq 0$ for all $\mu$, if for some $\mu \in \mathcal{U}_{sf}$ the function $J(x(0), \mu, w)$ is bounded with respect to $w$, then the upper value of the game is well defined. In which case, with the minimax controller $\mu_0$ in place,
\[ \overline{J} = \sup_{w \in \ell_2} J(x(0), \mu_0, w) = \sup_{w \in \ell_2} \bigl\{ \|z\|_2^2 - \gamma^2 \|w\|_2^2 \bigr\}. \]
In order to consider the induced norm of the closed loop system, defined by
\[ \|T_{zw}\| := \sup_{w \in \ell_2,\ w \neq 0} \frac{\|z\|_2}{\|w\|_2}, \]
we must restrict the initial state of the system to be zero, that is $x(0) = 0$. It is easy to see that $\|T_{zw}\| < \gamma$ if and only if there exists $\varepsilon > 0$ such that
\[ \frac{\|z\|_2^2}{\|w\|_2^2} \leq \gamma^2 - \varepsilon \quad \text{for all nonzero } w \in \ell_2. \]
This holds if and only if
\[ J(0, \mu, w) \leq -\varepsilon \|w\|_2^2 \quad \text{for all } w \in \ell_2. \]
Similarly, the nonstrict bound $\|T_{zw}\| \leq \gamma$ is achieved if and only if
\[ \|z\|_2^2 - \gamma^2 \|w\|_2^2 \leq 0 \quad \text{for all } w \in \ell_2. \]
This is equivalent to the condition that the feedback strategy $\mu$ for the $u$-player be such that $J(0, \mu, w) \leq 0$ for all $w \in \ell_2$. Clearly there will exist such a state feedback strategy if and only if
\[ \inf_{\mu \in \mathcal{U}_{sf}} \sup_{w \in \ell_2} J(0, \mu, w) \leq 0. \]
Hence the induced norm of the closed loop system, $\|T_{zw}\|$, is less than or equal to $\gamma$ if and only if the upper value of the game is less than or equal to zero. This same reasoning does not extend to the strictly suboptimal case, however. Obviously, for an observable system with $x_0 \neq 0$, the induced norm will not be finite, and the upper value of the game will therefore be strictly positive.

Remark 3.9. Note that the upper value can satisfy these criteria, and hence the strategy $\mu_0$ be a solution to the suboptimal $H_\infty$ problem, without there existing a saddle point for the differential game.
That is, the lower value might be different from the upper value.

Remark 3.10. The minimization of the induced norm is the motivation behind the $H_\infty$ synthesis problem. However, we can still formulate the differential game problem when the initial state is non-zero. This in fact corresponds to synthesis problems which also take account of the transient behaviour of the system, and we will in the sequel study the more general case when the initial state is non-zero.

Remark 3.11. In the infinite horizon case, or in the linear case with measurement feedback where $G_{22}$ is not strictly proper, it is possible that the resulting differential equations describing the dynamics of the closed loop do not have a well defined solution, given $\mu_0 \in \mathcal{U}_{sf}$, for all possible choices of $w \in \ell_2$. In this case we need to impose the further restriction that the controller is internally stabilising, and results in a well-posed closed loop.

Written as it is above, the game is simply a static game, where the possible choices for $u$ are defined by a class of functions. We might hope to be able to regard this as simply a static game with $u$ and $w$ both in $\ell_2$. However, the existence of
\[ \inf_{u \in \ell_2} \sup_{w \in \ell_2} J(x(0), u, w) \]
is not necessary for the existence of
\[ \inf_{\mu \in \mathcal{U}_{sf}} \sup_{w \in \ell_2} J(x(0), \mu, w). \]
Hence we cannot necessarily achieve the same value of $\gamma$ for these problems. Also, if $u \in \ell_2$ is fixed, then the problem loses the induced norm interpretation, since $\|z\|$ will be non-zero even for zero $w$, in which case $\|T_{zw}w\|_2 / \|w\|_2$ is unbounded with respect to $w$.

Rather than viewing the above problem as a static game over function spaces, it is advantageous to view it as a sequence of static games over the finite dimensional spaces $\mathbb{R}^{m_2}$ and $\mathbb{R}^{m_1}$, where $u(i) \in \mathbb{R}^{m_2}$ and $w(i) \in \mathbb{R}^{m_1}$.
In this case, each successive game depends on the outcome of the previous one, and the functional dependence of $u$ on the state is regarded as the information available to $u$ at each time step about the previous actions of the players. Note that the future dynamics and hence future cost of the game can be determined entirely from the current state and future actions of the players. We can then specify other classes of problem by describing the information available to the players at any given time step, rather than describing the set of functions to which $u$ or $w$ must belong.

Using the standard arguments of dynamic programming, it is easy to see that
\[ \inf_{\mu \in \mathcal{U}_{sf}[k,K]} \sup_{\nu \in \mathcal{U}_{sf}[k,K]} \sum_{i=k}^{K} \bigl\{ |z(i)|^2 - \gamma^2 |w(i)|^2 \bigr\} = \inf_{u(k) \in \mathbb{R}^{m_2}} \sup_{w(k) \in \mathbb{R}^{m_1}} \Bigl\{ z(k)'z(k) - \gamma^2 w(k)'w(k) + \inf_{\mu \in \mathcal{U}_{sf}[k+1,K]} \sup_{\nu \in \mathcal{U}_{sf}[k+1,K]} \sum_{i=k+1}^{K} \bigl\{ |z(i)|^2 - \gamma^2 |w(i)|^2 \bigr\} \Bigr\} \tag{3.4} \]
Since the last term on the right hand side of equation (3.4) is dependent solely on $x(k+1)$, we may define the value function as
\[ V\bigl(k, x(k)\bigr) := \inf_{\mu \in \mathcal{U}_{sf}[k,K]} \sup_{\nu \in \mathcal{U}_{sf}[k,K]} \sum_{i=k}^{K} \bigl\{ |z(i)|^2 - \gamma^2 |w(i)|^2 \bigr\} \]
and write down the Isaacs equation:
\[ V\bigl(k, x(k)\bigr) = \inf_{u(k)} \sup_{w(k)} \Bigl\{ z(k)'z(k) - \gamma^2 w(k)'w(k) + V\bigl(k+1, x(k+1)\bigr) \Bigr\} \tag{3.5} \]
with boundary condition $V(K+1, x(K+1)) = 0$. Here we have an expression of the form
\[ V\bigl(k, x(k)\bigr) = \inf_{u(k)} \sup_{w(k)} F\bigl(x(k), u(k), w(k)\bigr) \tag{3.6} \]
hence the extremising $u(k)$ and $w(k)$ are functions of $x(k)$. The Isaacs equation is a special form of the Hamilton-Jacobi-Bellman equation, derived using the principle of dynamic programming. It is a recursion, and may be calculated backwards in time. This is what we would expect, since with full state feedback the current decision depends only on the current state and the future cost. Hence, at time $K$, we know that the future cost is zero, and so we can calculate exactly the optimal current inputs as a function of the current state. Equation (3.5) was derived in continuous time in the early 1950s by Rufus Isaacs [35].
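For a concrete instance of the backward recursion (3.5), consider a scalar linear-quadratic special case. This is an assumption made here purely for illustration and is not part of the general nonlinear setting: take $x(k+1) = a\,x(k) + b\,u(k) + d\,w(k)$ with $|z(k)|^2 = c^2 x(k)^2 + u(k)^2$. Substituting $V(k+1, x) = p_{k+1} x^2$ into (3.5) and solving the static game at step $k$ gives $V(k, x) = p_k x^2$ with $p_k = c^2 + a^2 p_{k+1} / \bigl(1 + (b^2 - d^2/\gamma^2) p_{k+1}\bigr)$, provided $\gamma^2 - d^2 p_{k+1} > 0$ so that the inner maximization over $w(k)$ is strictly concave. The function name `isaacs_recursion` and the parameter names are hypothetical.

```python
# Illustrative sketch: backward Isaacs recursion for an assumed scalar
# linear-quadratic game
#     x(k+1) = a*x(k) + b*u(k) + d*w(k),   |z(k)|^2 = c^2 x(k)^2 + u(k)^2,
# with quadratic value function V(k, x) = p[k] * x^2.

def isaacs_recursion(a, b, d, c, gamma, K):
    """Return [p[0], ..., p[K+1]] with boundary condition p[K+1] = 0."""
    p = [0.0] * (K + 2)
    for k in range(K, -1, -1):
        p_next = p[k + 1]
        # The static game in w(k) has a finite supremum only if it is
        # strictly concave in w(k).
        if gamma ** 2 - d ** 2 * p_next <= 0.0:
            raise ValueError("no saddle point at step %d: gamma too small" % k)
        lam = 1.0 + (b ** 2 - d ** 2 / gamma ** 2) * p_next
        p[k] = c ** 2 + a ** 2 * p_next / lam
    return p


if __name__ == "__main__":
    p = isaacs_recursion(a=0.9, b=1.0, d=0.5, c=1.0, gamma=2.0, K=5)
    for k, pk in enumerate(p):
        print("k = %d, V(k, x) = %.6f * x^2" % (k, pk))
```

The recursion runs from $k = K$ down to $k = 0$, and in this sketch a value of $\gamma$ that is too small shows up as a loss of concavity at some step rather than as a solution.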
The following theorem can be found in [8].

Theorem 3.12. Suppose there exists a function $V\bigl(k, x(k)\bigr)$ satisfying (3.5) for $0 \leq k \leq K$, and at each $k$ there exists a unique saddle point solution to the static game described by equation (3.6). Then the state feedback strategies derived from this solution are a state feedback saddle point solution for the above dynamic game, and the value of the game is given by $V\bigl(0, x(0)\bigr)$.

If the conditions of the theorem hold, then we can write the saddle point controller for the minimizing player as
\[ \mu_0\bigl(k, x(k)\bigr) := \arg \inf_{u(k)} \sup_{w(k)} \Bigl\{ z(k)'z(k) - \gamma^2 w(k)'w(k) + V\bigl(k+1, x(k+1)\bigr) \Bigr\} \tag{3.7} \]

3.4 Continuous time dynamic games with full state information

Many of the arguments of the previous section regarding the correspondence of strategies and signals apply equally to this section, in particular Remark 3.7. We first describe the general results from dynamic programming in the nonlinear case, and then apply them to the linear problem.

3.4.1 Nonlinear continuous time dynamic games

We consider finite dimensional nonlinear systems as follows:
\[ \dot{x}(t) = f\bigl(t, x(t), u(t), w(t)\bigr), \qquad x(0) = x_0 \]
\[ z(t) = g\bigl(t, x(t), u(t)\bigr) \]
The state $x(t) \in \mathbb{R}^n$, the disturbance $w \in L_2^{m_1}[0, t_f]$, and the control input $u \in L_2^{m_2}[0, t_f]$. Let
\[ \mathcal{U}_{sf}^{n,k}[t, t_f] := \bigl\{ v : [t, t_f] \times \mathbb{R}^n \to \mathbb{R}^k \ ;\ \forall x \in L_2^n[t, t_f],\ v(\cdot, x(\cdot)) \in L_2^k[t, t_f] \bigr\} \]
be the space of all continuous time state feedback strategies on $[t, t_f]$, where $\mu, \nu \in \mathcal{U}_{sf}$. The cost function is
\[ J(x(0), \mu, \nu) = \int_0^{t_f} \bigl( z(t)'z(t) - \gamma^2 w(t)'w(t) \bigr)\, dt + x(t_f)'Q_f x(t_f). \]
Here $Q_f$ is a symmetric positive semidefinite weighting matrix on the final state.
We would like to find a saddle point for the problem
\[ \inf_{\mu \in \mathcal{U}_{sf}} \sup_{\nu \in \mathcal{U}_{sf}} J(x(0), \mu, \nu). \]
Let $t_1 \in (0, t_f)$, then
\[ \inf_{\mu \in \mathcal{U}_{sf}[0,t_f]} \sup_{\nu \in \mathcal{U}_{sf}[0,t_f]} \Bigl\{ \int_0^{t_f} \bigl( z(t)'z(t) - \gamma^2 w(t)'w(t) \bigr)\, dt + x(t_f)'Q_f x(t_f) \Bigr\} = \inf_{\mu \in \mathcal{U}_{sf}[0,t_1]} \sup_{\nu \in \mathcal{U}_{sf}[0,t_1]} \Bigl\{ \int_0^{t_1} \bigl( z(s)'z(s) - \gamma^2 w(s)'w(s) \bigr)\, ds + \inf_{\mu \in \mathcal{U}_{sf}[t_1,t_f]} \sup_{\nu \in \mathcal{U}_{sf}[t_1,t_f]} \Bigl[ \int_{t_1}^{t_f} \bigl( z(t)'z(t) - \gamma^2 w(t)'w(t) \bigr)\, dt + x(t_f)'Q_f x(t_f) \Bigr] \Bigr\} \tag{3.8} \]
As in the discrete time case, we can now define the value function as
\[ V(t, x(t)) = \inf_{\mu \in \mathcal{U}_{sf}[t,t_f]} \sup_{\nu \in \mathcal{U}_{sf}[t,t_f]} \Bigl\{ \int_t^{t_f} \bigl( z(s)'z(s) - \gamma^2 w(s)'w(s) \bigr)\, ds + x(t_f)'Q_f x(t_f) \Bigr\}. \tag{3.9} \]
Whereas in discrete time the value function satisfied a recursion, in continuous time it satisfies a partial differential equation. The continuous time counterpart of equation (3.5) is
\[ -\frac{\partial}{\partial t} V(t, x(t)) = \inf_{u(t)} \sup_{w(t)} \Bigl\{ z(t)'z(t) - \gamma^2 w(t)'w(t) + \frac{\partial}{\partial x} V(t, x(t))\, \dot{x}(t) \Bigr\} \tag{3.10} \]
with boundary condition $V(t_f, x(t_f)) = x(t_f)'Q_f x(t_f)$. The following theorem can be found in [7, 8].

Theorem 3.13. Suppose there exists a continuously differentiable function $V(t, x(t))$ satisfying (3.10) for $t \in [0, t_f]$, and at each $t$ there exists a saddle point solution to the corresponding static game in equation (3.10). Then the state feedback strategies derived from this solution are a full information saddle point solution for the above dynamic game, and the value of the game is given by $V(0, x(0))$.

3.4.2 The linear-quadratic game

This game is referred to as "linear-quadratic" since it has linear dynamics and a quadratic cost function. We consider finite dimensional linear time varying systems as follows:
\[ \dot{x}(t) = A(t)x(t) + B_1(t)w(t) + B_2(t)u(t) \]
\[ z(t) = C_1(t)x(t) + D_{12}(t)u(t) \]
We make the simplifying assumptions that $C_1(t)'D_{12}(t) = 0$ and $D_{12}(t)'D_{12}(t) = I$ for all $t$. The state $x(t) \in \mathbb{R}^n$, the disturbance $w \in L_2^{m_1}[0, t_f]$, and the control input $u \in L_2^{m_2}[0, t_f]$.
We will try to solve the Isaacs equation with a value function of the form
\[ V(t, x(t)) = x(t)'X_\infty(t)x(t). \tag{3.11} \]
Then equation (3.10) becomes
\[ -x(t)'\dot{X}_\infty(t)x(t) = \inf_{u(t)} \sup_{w(t)} \Bigl\{ z(t)'z(t) - \gamma^2 w(t)'w(t) + x(t)'X_\infty(t)\bigl( A(t)x(t) + B_1(t)w(t) + B_2(t)u(t) \bigr) + \bigl( A(t)x(t) + B_1(t)w(t) + B_2(t)u(t) \bigr)'X_\infty(t)x(t) \Bigr\}, \]
which, by completing the square, becomes
\[ -x(t)'\dot{X}_\infty(t)x(t) = \inf_{u(t)} \sup_{w(t)} \Bigl\{ x(t)'\bigl\{ X_\infty(t)A + A'X_\infty(t) + C_1'C_1 - X_\infty(t)(B_2B_2' - \gamma^{-2}B_1B_1')X_\infty(t) \bigr\}x(t) + \bigl| u(t) + B_2'X_\infty(t)x(t) \bigr|^2 - \gamma^2 \bigl| w(t) - \gamma^{-2}B_1'X_\infty(t)x(t) \bigr|^2 \Bigr\}. \]
Since this equation is true for all initial conditions of the system, we arrive at the well known Riccati differential equation
\[ -\dot{X}_\infty(t) = X_\infty(t)A + A'X_\infty(t) + C_1'C_1 - X_\infty(t)(B_2B_2' - \gamma^{-2}B_1B_1')X_\infty(t) \tag{3.12} \]
with boundary condition $X_\infty(t_f) = Q_f$, and the min-max solutions for $u$ and $w$ are
\[ \mu_0(t, x(t)) = -B_2(t)'X_\infty(t)x(t) \tag{3.13} \]
\[ \nu_0(t, x(t)) = \gamma^{-2}B_1(t)'X_\infty(t)x(t) \tag{3.14} \]
Using Theorem 3.13, we see that this gives us a sufficient condition for existence of a saddle point to the continuous time dynamic game. If there exists a solution to the Riccati differential equation (3.12) on $[0, t_f]$, then there exists a saddle point to the dynamic game given by equations (3.13) and (3.14).

3.4.3 Completion of squares

We know that the above solution, since it is a saddle point, provides us with a solution to the $H_\infty$ problem. We can also derive it directly as follows.

Theorem 3.14. Suppose that equation (3.12) has a solution on $[0, t_f]$. Then
\[ J(x(0), \mu, \nu) = x(0)'X_\infty(0)x(0) + \|u - \mu_0\|_2^2 - \gamma^2 \|w - \nu_0\|_2^2 \]
for any controller $\mu$ and any input $w$, and the differential game has a unique saddle point solution at $(\mu_0, \nu_0)$. Here $u$ and $w$ are the signals generated by $\mu$ and $\nu$ respectively, and we abbreviate $\mu_0(\cdot, x(\cdot))$ by $\mu_0$. If $x(0) = 0$, then with the controller $u(t) = \mu_0(t, x(t))$ in place, the closed loop norm satisfies $\|T_{zw}\| < \gamma$.

Proof.
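Clearly
\[ \int_0^{t_f} \frac{d}{dt}\bigl( x'X_\infty x \bigr)\, dt = x(t_f)'X_\infty(t_f)x(t_f) - x(0)'X_\infty(0)x(0) = x(t_f)'Q_f x(t_f) - x(0)'X_\infty(0)x(0). \]
Hence
\[ \begin{aligned}
J(x(0), \mu, \nu) &= \int_0^{t_f} \bigl( z(t)'z(t) - \gamma^2 w(t)'w(t) \bigr)\, dt + x(t_f)'Q_f x(t_f) \\
&= x(0)'X_\infty(0)x(0) + \int_0^{t_f} \Bigl( z'z - \gamma^2 w'w + \frac{d}{dt} x'X_\infty x \Bigr)\, dt \\
&= x(0)'X_\infty(0)x(0) + \int_0^{t_f} \bigl( z'z - \gamma^2 w'w + \dot{x}'X_\infty x + x'X_\infty \dot{x} + x'\dot{X}_\infty x \bigr)\, dt \\
&= x(0)'X_\infty(0)x(0) + \int_0^{t_f} \bigl( z'z - \gamma^2 w'w + (Ax + B_1w + B_2u)'X_\infty x + x'X_\infty(Ax + B_1w + B_2u) + x'\dot{X}_\infty x \bigr)\, dt \\
&= x(0)'X_\infty(0)x(0) + \int_0^{t_f} \bigl( x'(C_1'C_1 + A'X_\infty + X_\infty A + \dot{X}_\infty)x + u'u - \gamma^2 w'w + (B_1w + B_2u)'X_\infty x + x'X_\infty(B_1w + B_2u) \bigr)\, dt \\
&= x(0)'X_\infty(0)x(0) + \int_0^{t_f} \bigl( x'X_\infty(B_2B_2' - \gamma^{-2}B_1B_1')X_\infty x + u'u - \gamma^2 w'w + (B_1w + B_2u)'X_\infty x + x'X_\infty(B_1w + B_2u) \bigr)\, dt \\
&= x(0)'X_\infty(0)x(0) + \int_0^{t_f} \bigl( (u + B_2'X_\infty x)'(u + B_2'X_\infty x) - \gamma^2 (w - \gamma^{-2}B_1'X_\infty x)'(w - \gamma^{-2}B_1'X_\infty x) \bigr)\, dt \\
&= x(0)'X_\infty(0)x(0) + \|u - \mu_0\|_2^2 - \gamma^2 \|w - \nu_0\|_2^2.
\end{aligned} \]
So with $x(0) = 0$ and $u(t) = \mu_0(t, x(t))$, since
\[ J(x(0), \mu, \nu) = \|z\|_2^2 - \gamma^2 \|w\|_2^2 + x(t_f)'Q_f x(t_f), \]
then
\[ \|z\|_2^2 - \gamma^2 \|w\|_2^2 = -\gamma^2 \|w - \nu_0\|_2^2 - x(t_f)'Q_f x(t_f). \]
Since $Q_f \geq 0$, clearly $\|z\|_2^2 - \gamma^2 \|w\|_2^2 \leq 0$, hence $\|T_{zw}\| \leq \gamma$. To show that this inequality is in fact strict, we need to show that there exists $\varepsilon > 0$ such that $\|z\|_2^2 - \gamma^2 \|w\|_2^2 \leq -\varepsilon \|w\|_2^2$ for all $w \in L_2$. With the controller in place, the closed loop dynamics satisfy
\[ \dot{x} = (A - B_2B_2'X_\infty)x + B_1w \]
\[ w - \nu_0 = -\gamma^{-2}B_1'X_\infty x + w. \]
This gives an invertible finite dimensional state space relationship between $w$ and $w - \nu_0$, so the induced norm both of this operator and its inverse is finite. Therefore there exists $\delta > 0$ such that
\[ \delta \|w\|_2^2 \leq \|w - \nu_0\|_2^2 \quad \forall w \in L_2 \]
and so the result follows with $\varepsilon = \gamma^2 \delta$.

Note that this appears to contradict the possibility of $w$ choosing $w(t) = \nu_0(t, x(t))$ as a strategy, which would give
\[ \|z\|_2^2 - \gamma^2 \|w\|_2^2 = -x(t_f)'Q_f x(t_f) \]
which, if either $Q_f = 0$ or $x(t_f) = 0$, appears to imply that $w$ can actually attain the induced norm bound. However, since $\mu_0(t, x(t)) = 0$ and $\nu_0(t, x(t)) = 0$ for $x(t) = 0$, with these strategies $u$, $w$ and $x$ are identically zero, hence $z$ is identically zero also.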
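The Riccati differential equation (3.12) is integrated backwards in time from the terminal condition $X_\infty(t_f) = Q_f$. The sketch below does this for a scalar, time invariant special case using a classical fourth order Runge-Kutta scheme in the time-to-go variable $s = t_f - t$. The scalar restriction, the constant coefficients, the integration scheme and the names `riccati_backward` and `feedback_gains` are all assumptions made for illustration; they are not taken from the thesis.

```python
import math

# Illustrative sketch: scalar, time invariant version of equation (3.12),
#     -dX/dt = 2*a*X + c^2 - (b2^2 - b1^2 / gamma^2) * X^2,   X(tf) = qf,
# integrated in the time-to-go variable s = tf - t with classical RK4.

def riccati_backward(a, b1, b2, c, gamma, qf, tf, steps=1000):
    """Return a list xs with xs[i] approximating X(i * tf / steps)."""
    kappa = b2 ** 2 - b1 ** 2 / gamma ** 2

    def rhs(x):
        # dX/ds, where s = tf - t
        return 2.0 * a * x + c ** 2 - kappa * x ** 2

    h = tf / steps
    x = qf
    xs = [x]
    for _ in range(steps):
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * h * k1)
        k3 = rhs(x + 0.5 * h * k2)
        k4 = rhs(x + h * k3)
        x = x + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        if not math.isfinite(x) or abs(x) > 1e12:
            # The solution may escape in finite time when gamma is too small.
            raise ValueError("Riccati solution does not exist on [0, tf]")
        xs.append(x)
    xs.reverse()  # reorder from time-to-go to forward time
    return xs


def feedback_gains(x_inf, b1, b2, gamma):
    """Gains of the strategies (3.13) and (3.14): u = k_u * x, w = k_w * x."""
    return -b2 * x_inf, b1 * x_inf / gamma ** 2


if __name__ == "__main__":
    xs = riccati_backward(a=0.0, b1=1.0, b2=1.0, c=1.0, gamma=2.0, qf=0.0, tf=1.0)
    print("X(0) is approximately", xs[0])
    print("gains at t = 0:", feedback_gains(xs[0], b1=1.0, b2=1.0, gamma=2.0))
```

With $a = 0$, $b_1 = 0$, $b_2 = c = 1$ and $Q_f = 0$ the equation reduces to $-\dot{X} = 1 - X^2$, whose solution $X(t) = \tanh(t_f - t)$ gives a simple check on the integrator.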
3.5 Continuous time games with imperfect measurement

3.5.1 Nonlinear systems

In this section we will describe the results that have been proved for differential games with partial information for nonlinear systems. The certainty equivalence principles that have been derived by various authors [78, 38, 7] do not depend on the linear character of the problem, and can be easily described in the nonlinear framework. We will then, in the next section, go on to specialize these results to the linear case. We will avoid describing the somewhat technical smoothness assumptions necessary to state the results precisely for general nonlinear systems, since in the rest of this thesis we will be concerned solely with linear systems.

Consider the system described by the following nonlinear differential equations on $[t_0, t_f]$:
\[ \dot{x}(t) = f\bigl(t, x(t), u(t), w(t)\bigr), \qquad x(t_0) = x_0 \]
\[ y(t) = h\bigl(t, x(t), w(t)\bigr) \tag{3.15} \]
Again the problem is a dynamic game, where the minimizing player is now required to be a causal function of the measured output $y$. That is, let
\[ \mathcal{U}_{of}^{p_2,m_2}[t_0, t_f] := \bigl\{ \mu : L_2^{p_2}[t_0, t_f] \to L_2^{m_2}[t_0, t_f] \ ;\ \Pi_{[t_0,t]}\, \mu\, \Pi_{[t_0,t]} = \Pi_{[t_0,t]}\, \mu \ \text{for all } t \in [t_0, t_f] \bigr\} \]
where $\Pi_{[t_0,t]}$ is the usual projection of $L_2[t_0, t_f]$ onto $L_2[t_0, t]$. The cost function will be
\[ J(x_0, \mu, w) = \psi(x_0) + \int_{t_0}^{t_f} g\bigl(t, x(t), u(t), w(t)\bigr)\, dt + \phi\bigl(x(t_f)\bigr) \]
and we would like to solve the differential game problem
\[ \inf_{\mu \in \mathcal{U}_{of}} \ \sup_{w \in L_2,\ x_0 \in \mathbb{R}^n} J(x_0, \mu, w). \]
Here we are considering the initial state $x_0$ of the system as part of the unknown disturbance, that is as part of the maximizer's strategy. This has a direct interpretation in the linear case, as we shall describe in the next section. Define the value of the game by
\[ V(\tau, \xi) = \inf_{\mu \in \mathcal{U}_{sf}} \sup_{w \in L_2} \Bigl\{ \int_\tau^{t_f} g\bigl(t, x(t), u(t), w(t)\bigr)\, dt + \phi\bigl(x(t_f)\bigr) \Bigr\} \]
subject to $x(\tau) = \xi$ and the dynamics described by equations (3.15).
Then, subject to certain technical assumptions, $V$ will satisfy
\[ -\frac{\partial}{\partial \tau} V(\tau, \xi) = \inf_{u(\tau)} \sup_{w(\tau)} \Bigl\{ g\bigl(\tau, \xi, u(\tau), w(\tau)\bigr) + \frac{\partial}{\partial \xi} V(\tau, \xi)\, f\bigl(\tau, \xi, u(\tau), w(\tau)\bigr) \Bigr\} \tag{3.16} \]
with boundary condition
\[ V(t_f, x) = \phi(x) \quad \forall x \in \mathbb{R}^n. \]
Whittle [78] describes the quantity $V(\tau, \xi)$ as the (partially extremized) future stress. This is exactly the Isaacs equation for the state feedback differential game problem. Let $u(t) = \mu_0(t, x(t))$ be the optimal state feedback controller.

At time $\tau \in [t_0, t_f]$, the control law has been implemented as a signal on $[t_0, \tau]$. Similarly, we have measurements of the output $y$ on this interval. We will denote these signals by $u^\tau$ and $y^\tau$ respectively. Similarly let $w^\tau$ be a possible past disturbance input. Define the set of disturbances $(x_0, w^\tau)$ consistent with past information $u^\tau$ and $y^\tau$ by
\[ \Omega_0(\tau) := \Bigl\{ (x_0, w^\tau) \ ;\ x_0 \in \mathbb{R}^n,\ w^\tau \in L_2[t_0, \tau], \text{ such that } x(t_0) = x_0,\ \dot{x}(t) = f\bigl(t, x(t), u^\tau(t), w^\tau(t)\bigr),\ y^\tau(t) = h\bigl(t, x(t), w^\tau(t)\bigr) \text{ for } t \in [t_0, \tau] \Bigr\} \]
and define those disturbances which are further consistent with the hypothesis that $x(\tau) = \xi$ by
\[ \Omega(\tau, \xi) := \Bigl\{ (x_0, w^\tau) \ ;\ x_0 \in \mathbb{R}^n,\ w^\tau \in L_2[t_0, \tau], \text{ such that } x(\tau) = \xi,\ x(t_0) = x_0,\ \dot{x}(t) = f\bigl(t, x(t), u^\tau(t), w^\tau(t)\bigr),\ y^\tau(t) = h\bigl(t, x(t), w^\tau(t)\bigr) \text{ for } t \in [t_0, \tau] \Bigr\}. \]
Now define the information state by
\[ P(\tau, \xi) = \sup_{(x_0, w^\tau) \in \Omega(\tau, \xi)} \Bigl\{ \psi(x_0) + \int_{t_0}^{\tau} g\bigl(t, x(t), u^\tau(t), w^\tau(t)\bigr)\, dt \Bigr\}, \]
where the dynamics and measurement constraints hold for all $t \in [t_0, \tau]$. The quantity $P(\tau, \xi)$ is the worst possible cost that could have been incurred so far if the current state were equal to $\xi$, consistent with the information given by knowledge of $u^\tau$ and $y^\tau$. It is shown in [38] that the function $P$ satisfies the Hamilton-Jacobi equation
\[ \frac{\partial}{\partial \tau} P(\tau, \xi) = \sup_{\substack{w(\tau) \\ y(\tau) = h(\tau, \xi, w(\tau))}} \Bigl\{ g\bigl(\tau, \xi, u(\tau), w(\tau)\bigr) - \frac{\partial}{\partial \xi} P(\tau, \xi)\, f\bigl(\tau, \xi, u(\tau), w(\tau)\bigr) \Bigr\} \tag{3.17} \]
with the boundary condition
\[ P(t_0, x) = \psi(x) \quad \forall x \in \mathbb{R}^n. \]
James and Baras [38] describe $P(\tau, \cdot)$ as an infinite dimensional state, and the above partial differential equation can then be viewed as the state evolution equation for this new state. For each $\tau$, the information state is a functional on $\mathbb{R}^n$.
Note that, for linear systems, the information state is a nonhomogeneous quadratic function of $\xi$, see [37, 78]. Whittle [78] calls this function the past stress.

The certainty equivalence principle. The following theorem, the discrete version of which was first formulated by Whittle [78], and which was proved in this form for the continuous time problem by James and Baras [38], gives us the desired certainty equivalence principle.

Theorem 3.15. Suppose $V$ is a smooth solution of the Isaacs equation (3.16) with the infimising saddle point solution for $u$ given by $\mu_0$, and $P$ is a smooth solution of the Hamilton-Jacobi equation (3.17). Further, suppose that at each $\tau \in [t_0, t_f]$ there exists a unique maximizing $\bar{x}(\tau)$ such that
\[ \bar{x}(\tau) = \arg\max_{x \in \mathbb{R}^n} \bigl\{ P(\tau, x) + V(\tau, x) \bigr\}; \]
then the certainty equivalence controller
\[ u(\tau) = \mu_0\bigl(\tau, \bar{x}(\tau)\bigr) \]
is minimizing for the game
\[ \inf_{\mu \in \mathcal{U}_{of}} \ \sup_{w \in L_2,\ x_0 \in \mathbb{R}^n} J(x_0, \mu, w), \]
where $\mu_0$ is the state feedback saddle point controller.

This theorem is similar to one first introduced by Whittle [78] for the discrete time linear-quadratic problem. It is proved for the continuous time nonlinear problem in [38, 40] using arguments from dynamic programming, and there are several stringent technical assumptions regarding smoothness that must be made. It says that, if we assume that the $u$ player will play optimally using state feedback in the future, to give a future cost of $V\bigl(\tau, x(\tau)\bigr)$, and consider the worst possible current state consistent with past optimization as a state estimate, then using this state estimate in the state feedback law is optimal for the problem. In this sense this is a certainty equivalence principle; we are using a worst case state as if it were the actual state. It is also shown in [38] that the problem can be expressed in the following way.
Let
\[ G(\tau, x_0, u^\tau, w^\tau) = \psi(x_0) + \int_{t_0}^{\tau} g\bigl(t, x(t), u^\tau(t), w^\tau(t)\bigr)\, dt + V\bigl(\tau, x(\tau)\bigr). \]
Also let
\[ W(\tau, u^\tau, y^\tau) = \sup_{(x_0, w^\tau) \in \Omega_0(\tau)} G(\tau, x_0, u^\tau, w^\tau). \]
Consider the function
\[ \begin{aligned}
\sup_{x \in \mathbb{R}^n} \bigl\{ P(\tau, x) + V(\tau, x) \bigr\}
&= \sup_{x \in \mathbb{R}^n} \Bigl\{ \sup_{(x_0, w^\tau) \in \Omega(\tau, x)} \Bigl[ \psi(x_0) + \int_{t_0}^{\tau} g\bigl(t, x(t), u^\tau(t), w^\tau(t)\bigr)\, dt \Bigr] + V\bigl(\tau, x(\tau)\bigr) \Bigr\} \\
&= \sup_{x \in \mathbb{R}^n} \ \sup_{(x_0, w^\tau) \in \Omega(\tau, x)} \Bigl\{ \psi(x_0) + \int_{t_0}^{\tau} g\bigl(t, x(t), u^\tau(t), w^\tau(t)\bigr)\, dt + V\bigl(\tau, x(\tau)\bigr) \Bigr\} \\
&= \sup_{(x_0, w^\tau) \in \Omega_0(\tau)} G(\tau, x_0, u^\tau, w^\tau).
\end{aligned} \]
The following theorem is stated in [7].

Theorem 3.16 (Başar and Bernhard [7]). Suppose $V$ is a smooth solution of the Isaacs equation (3.16), and that $\mu_0$ is the infimising state feedback control law for the $u$ player. If, for every $y$ and every $\tau \in [t_0, t_f]$, the problem
\[ \sup_{(x_0, w^\tau) \in \Omega_0(\tau)} \Bigl\{ \psi(x_0) + \int_{t_0}^{\tau} g\bigl(t, x(t), u^\tau(t), w^\tau(t)\bigr)\, dt + V\bigl(\tau, x(\tau)\bigr) \Bigr\} \]
has a unique maximum $(\hat{x}_0^\tau, \hat{w}^\tau)$ generating a state trajectory $\hat{x}^\tau$, then the control law
\[ u(\tau) = \mu_0\bigl(\tau, \hat{x}^\tau(\tau)\bigr) \]
is optimal for the differential game with output feedback information, and its value is given by
\[ \inf_{\mu \in \mathcal{U}_{of}} \ \sup_{x_0 \in \mathbb{R}^n,\ w \in L_2} J(x_0, \mu, w) = \max_{x_0 \in \mathbb{R}^n} \bigl\{ V(t_0, x_0) + \psi(x_0) \bigr\}. \]

This theorem can be interpreted as follows. At any time $\tau$, suppose that the minimizing player $u$ will play optimal state feedback in the future, and calculate the worst case disturbances $w^\tau, x_0$ which maximize the total cost. This worst case disturbance defines a unique worst case state trajectory, which should be used in the state feedback law as if it were the actual state. In [7] the theorem is proved using techniques from dynamic programming. It is stated that the result holds not only for the linear quadratic case but also for nonlinear and nonquadratic problems. Note that in discrete time, the corresponding results to Theorem 3.15 and Theorem 3.16 are not equivalent, and the certainty equivalence controller may not be optimal. Bernhardsson [12, p. 66] gives a counterexample to the certainty equivalence principle in the discrete time nonlinear-nonquadratic case.
Also James [40] gives conditions under which the certainty equivalence controller is optimal in discrete time. The certainty equivalence principle above can be proved in the following way:

Theorem 3.17. Suppose that $\hat{\mu} \in \mathcal{U}_{of}$ is such that, for all $t_1, t_2 \in [t_0, t_f]$ with $t_2 \geq t_1$, and for all $y$, $W(t_2, u^{t_2}, y^{t_2}) \leq W(t_1, u^{t_1}, y^{t_1})$. That is, $W$ is a decreasing function of $\tau$ when the signal $u$ is defined by the feedback law $\hat{\mu}$. Then
\[ \inf_{\mu \in \mathcal{U}_{of}} \ \sup_{x_0 \in \mathbb{R}^n,\ w \in L_2} J(x_0, \mu, w) = \max_{x_0 \in \mathbb{R}^n} \bigl\{ V(t_0, x_0) + \psi(x_0) \bigr\} \]
and the infimum is achieved by $\mu = \hat{\mu}$.

Proof. Clearly
\[ J(x_0, \hat{\mu}, w) \leq W(t_f, u^{t_f}, y^{t_f}). \]
Since $W$ is decreasing,
\[ J(x_0, \hat{\mu}, w) \leq W(t_0, u^{t_0}, y^{t_0}) = \max_{x_0 \in \mathbb{R}^n} \bigl\{ V(t_0, x_0) + \psi(x_0) \bigr\} \quad \forall w \in L_2. \]
Therefore, with this feedback strategy for the $u$ player,
\[ \sup_{x_0 \in \mathbb{R}^n,\ w \in L_2} J(x_0, \hat{\mu}, w) \leq \max_{x_0 \in \mathbb{R}^n} \bigl\{ V(t_0, x_0) + \psi(x_0) \bigr\}. \]
We therefore need to show that for any other feedback strategy, the $w$ player can make the cost function larger. That is, we wish to show that, for any strategy $\mu$, there exists a disturbance $w$, $x_0$ such that
\[ J(x_0, \mu, w) \geq \max_{x_0 \in \mathbb{R}^n} \bigl\{ V(t_0, x_0) + \psi(x_0) \bigr\}. \]
Let
\[ J_0(x_0, \mu, w) = \int_{t_0}^{t_f} g\bigl(t, x(t), u(t), w(t)\bigr)\, dt + \phi\bigl(x(t_f)\bigr). \]
Then for any $\mu \in \mathcal{U}_{of}$,
\[ J_0(x_0, \mu, \nu_0) \geq V(t_0, x_0). \tag{3.18} \]
Here $\nu_0 \in \mathcal{U}_{sf}$ is the state feedback saddle point strategy for the maximizing player. To be strictly accurate, we know that equation (3.18) holds for all possible signals generated by the minimizing player, rather than all possible feedback strategies. However, as in Remark 3.7, $\mu$ with $\nu_0$ in place defines a unique $u \in L_2$, and $J_0(x_0, \mu, \nu_0) = J_0(x_0, u, \nu_0)$. Since $\nu_0$ is a saddle point strategy the inequality holds. Then, with $\mu = \hat{\mu}$, the feedback law $\nu_0$ generates a unique signal $\hat{w} \in L_2$ for which the inequality is satisfied also. Hence
\[ \max_{x_0 \in \mathbb{R}^n} \bigl\{ J_0(x_0, \mu, \hat{w}) + \psi(x_0) \bigr\} \geq \max_{x_0 \in \mathbb{R}^n} \bigl\{ V(t_0, x_0) + \psi(x_0) \bigr\} \]
and therefore, for each $\mu \in \mathcal{U}_{of}$, there exists a $\hat{w}$ such that
\[ J(x_0, \mu, \hat{w}) \geq V(t_0, x_0) + \psi(x_0) \]
and with $\mu = \hat{\mu}$
\[ J(x_0, \hat{\mu}, w) \leq \max_{x_0 \in \mathbb{R}^n} \bigl\{ V(t_0, x_0) + \psi(x_0) \bigr\} \quad \forall w \in L_2. \]
Hence the result follows.
Başar and Bernhard [7] claim that, for general linear systems, $W$ is a decreasing function of $\tau$ for all $y$ given uniqueness of the maximum of $G$ for each $\tau$. In the sequel we will give an explicit formula for $\partial W / \partial \tau$, and hence prove directly that the optimal state feedback with the certainty equivalence controller is the infimising solution to the upper value problem.

3.5.2 The linear case

We will consider the system described by
\[ \dot{x}(t) = A(t)x(t) + B_1(t)w(t) + B_2(t)u(t), \qquad x(t_0) = x_0 \]
\[ z(t) = C_1(t)x(t) + D_{11}(t)w(t) + D_{12}(t)u(t) \]
\[ y(t) = C_2(t)x(t) + D_{21}(t)w(t) + D_{22}(t)u(t) \]
on a finite time interval $[t_0, t_f]$. Under the assumption that $D_{12}(t)$ and $D_{21}(t)$ have full column rank and full row rank respectively for all $t \in [t_0, t_f]$, it is possible to show that we may transform the problem to an equivalent one in which
\[ D_{21}(t)D_{21}(t)' = I, \qquad D_{12}(t)'D_{12}(t) = I, \qquad D_{11}(t) = 0, \qquad D_{22}(t) = 0. \tag{3.19} \]
That is, if we can construct a controller for the problem with the above assumptions, then we may construct a controller for the original problem also. Further, if the original system matrices are continuous functions of time, then the transformed matrices will also be continuous. For details of these loop-shifting transformations, see for example [50, 29]. Also, following [22] we will make the additional standing assumptions that
\[ D_{21}(t)B_1(t)' = 0, \qquad D_{12}(t)'C_1(t) = 0 \tag{3.20} \]
for all $t \in [t_0, t_f]$. These assumptions are not without some loss of generality, but their inclusion reduces the complexity of the resulting formulae, and removing them would introduce no significant technical difficulties.

Again the problem is a dynamic game, where the minimizing player is now required to be a causal function of the measured output $y$, that is we will require that $u = \mu(y)$ with $\mu \in \mathcal{U}_{of}$. By similar arguments to those used in the state feedback case, we can consider $w$ to be a signal in $L_2$ without loss of generality.
It is also possible to consider $x_0$ to be part of the unknown disturbance, that is, part of the maximizing player's strategy. We can then define the cost by
\[ J(x_0, \mu, w) = -\gamma^2 x_0'Q_0x_0 + \int_{t_0}^{t_f} \bigl( z(t)'z(t) - \gamma^2 w(t)'w(t) \bigr)\, dt + x(t_f)'Q_f x(t_f) \]
where $Q_0 > 0$ and $Q_f > 0$ are symmetric positive definite weighting matrices. We would then like to solve the problem
\[ \inf_{\mu \in \mathcal{U}_{of}} \ \sup_{w \in L_2,\ x_0 \in \mathbb{R}^n} J(x_0, \mu, w). \]
If we define the norms
\[ \|(z, x(t_f))\|^2 := \|z\|_2^2 + x(t_f)'Q_f x(t_f), \qquad \|(w, x_0)\|^2 := \|w\|_2^2 + x_0'Q_0x_0, \]
then if we choose $\mu \in \mathcal{U}_{of}$ such that
\[ J(x_0, \mu, w) \leq 0 \quad \forall w \in L_2,\ x_0 \in \mathbb{R}^n \]
then
\[ \|(z, x(t_f))\|^2 \leq \gamma^2 \|(w, x_0)\|^2 \quad \forall w \in L_2,\ x_0 \in \mathbb{R}^n. \tag{3.21} \]
Again, we can view this as an induced norm minimization, defining $T : L_2 \times \mathbb{R}^n \to L_2 \times \mathbb{R}^n$ by $T : (w, x_0) \mapsto (z, x(t_f))$, giving
\[ \|T\| \leq \gamma \iff J(x_0, \mu, w) \leq 0 \quad \forall w \in L_2,\ x_0 \in \mathbb{R}^n. \]
This gives a formulation of $H_\infty$ control taking transients into account, as described in [42]. Note that if $\|T\| \leq \gamma$, then $\|T_{zw}\| \leq \gamma$ also, since equation (3.21) holds for all $x_0$, including $x_0 = 0$. Define
\[ G(\tau, x_0, u^\tau, w^\tau) := -\gamma^2 x_0'Q_0x_0 + \int_{t_0}^{\tau} \bigl( z(t)'z(t) - \gamma^2 w(t)'w(t) \bigr)\, dt + V\bigl(\tau, x(\tau)\bigr) \]
where $V$ is the value of the game defined by equation (3.9). Then the worst case trajectory is given by solving the following maximization problem:
\[ W(\tau, u^\tau, y^\tau) := \sup_{(x_0, w^\tau) \in \Omega_0(\tau)} G(\tau, x_0, u^\tau, w^\tau). \tag{3.22} \]
For the linear quadratic problem, if $X_\infty$ exists satisfying equation (3.12), then $V(t, x(t)) = x(t)'X_\infty(t)x(t)$ from equation (3.11), where $X_\infty(t)$ is the solution to the control Riccati differential equation (3.12) introduced previously. We now explicitly construct the worst case trajectory and disturbance.

Lemma 3.18. Suppose there exists a unique maximum to the constrained optimization problem in equation (3.22). Then there exists a unique solution to the following two point boundary value problem, satisfied by the worst case trajectory $\hat{x}^\tau$ and worst case disturbance $(\hat{x}_0^\tau, \hat{w}^\tau)$.
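$$\begin{aligned}
\dot{\hat x}_\tau &= A\hat x_\tau + \gamma^{-2}B_1B_1'\lambda + B_2u_\tau, & \hat x_\tau(t_0) &= \hat x_0\\
-\dot\lambda &= A'\lambda + C_1'C_1\hat x_\tau + \gamma^2C_2'(y_\tau - C_2\hat x_\tau), & \lambda(\tau) &= X_\infty(\tau)\hat x_\tau(\tau)\\
\hat w_\tau &= \gamma^{-2}B_1'\lambda + D_{21}'(y_\tau - C_2\hat x_\tau)\\
\hat x_0 &= \gamma^{-2}Q_0^{-1}\lambda(t_0)
\end{aligned}$$

Proof. The proof is a standard application of the maximum principle to the constrained problem. Describe the system on $[t_0, \tau]$ by
$$\begin{pmatrix} x(\tau)\\ z_\tau\\ y_\tau\end{pmatrix} = \Gamma\begin{pmatrix} x_0\\ u_\tau\\ w_\tau\end{pmatrix}, \qquad \Gamma = (\Gamma_{ij})_{i,j=1}^{3}.$$
Then
$$\begin{aligned}
G(\tau; x_0, u_\tau, w_\tau) ={}& \langle \Gamma_{11}x_0 + \Gamma_{12}u_\tau + \Gamma_{13}w_\tau,\; X_\infty(\tau)(\Gamma_{11}x_0 + \Gamma_{12}u_\tau + \Gamma_{13}w_\tau)\rangle\\
&+ \langle \Gamma_{21}x_0 + \Gamma_{22}u_\tau + \Gamma_{23}w_\tau,\; \Gamma_{21}x_0 + \Gamma_{22}u_\tau + \Gamma_{23}w_\tau\rangle - \gamma^2\langle w_\tau, w_\tau\rangle - \gamma^2\langle x_0, Q_0x_0\rangle.
\end{aligned}$$
Let $\nu$ be a Lagrange multiplier, then
$$\tilde G = G + \langle \nu, y_\tau\rangle - \langle \nu,\; \Gamma_{31}x_0 + \Gamma_{32}u_\tau + \Gamma_{33}w_\tau\rangle.$$
Write $\hat w_\tau$ for the maximizing $w_\tau$, and $\hat x_\tau$ for the corresponding state trajectory for the problem at time $\tau$. Maximization leads to
$$-\tfrac12\Gamma_{33}^*\nu + \Gamma_{23}^*(\Gamma_{21}\hat x_0 + \Gamma_{22}u_\tau + \Gamma_{23}\hat w_\tau) + \Gamma_{13}^*X_\infty(\tau)(\Gamma_{11}\hat x_0 + \Gamma_{12}u_\tau + \Gamma_{13}\hat w_\tau) - \gamma^2\hat w_\tau = 0$$
and
$$-\tfrac12\Gamma_{31}^*\nu + \Gamma_{21}^*(\Gamma_{21}\hat x_0 + \Gamma_{22}u_\tau + \Gamma_{23}\hat w_\tau) + \Gamma_{11}^*X_\infty(\tau)(\Gamma_{11}\hat x_0 + \Gamma_{12}u_\tau + \Gamma_{13}\hat w_\tau) - \gamma^2Q_0\hat x_0 = 0.$$
We can rewrite these equations as
$$\begin{aligned}
\gamma^2\hat w_\tau &= -\tfrac12\Gamma_{33}^*\nu + \Gamma_{23}^*\hat z_\tau + \Gamma_{13}^*X_\infty(\tau)\hat x_\tau(\tau)\\
\gamma^2Q_0\hat x_0 &= -\tfrac12\Gamma_{31}^*\nu + \Gamma_{21}^*\hat z_\tau + \Gamma_{11}^*X_\infty(\tau)\hat x_\tau(\tau)
\end{aligned} \tag{3.23}$$
Let $\Phi(t, r)$ be the transition matrix associated with $A(t)$. Then define the operator $\Psi$ on $L_2[t_0, \tau]$ by
$$(\Psi z)(t) = \int_{t_0}^{t}\Phi(t, r)z(r)\,dr.$$
Also define the operator $\Theta : \mathbb{R}^n \to L_2[t_0, \tau]$ by $(\Theta\xi)(t) = \Phi(t, t_0)\xi$ and the linear operator $F : L_2[t_0, \tau] \to$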
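$\mathbb{R}^n$ by $Fz = z(\tau)$. Then we can write
$$\Gamma = \begin{bmatrix} F\Theta & F\Psi B_2 & F\Psi B_1\\ C_1\Theta & C_1\Psi B_2 + D_{12} & C_1\Psi B_1\\ C_2\Theta & C_2\Psi B_2 & C_2\Psi B_1 + D_{21}\end{bmatrix}$$
and
$$\Gamma^* = \begin{bmatrix} \Theta^*F^* & \Theta^*C_1' & \Theta^*C_2'\\ B_2'\Psi^*F^* & B_2'\Psi^*C_1' + D_{12}' & B_2'\Psi^*C_2'\\ B_1'\Psi^*F^* & B_1'\Psi^*C_1' & B_1'\Psi^*C_2' + D_{21}'\end{bmatrix}.$$
Then let
$$\begin{pmatrix}\zeta_1\\ \zeta_2\\ \zeta_3\end{pmatrix} = \Gamma^*\begin{pmatrix}\eta_1\\ \eta_2\\ \eta_3\end{pmatrix}$$
and this operator has a realization given by
$$\begin{aligned}
-\dot\lambda(t) &= A(t)'\lambda(t) + C_1(t)'\eta_2(t) + C_2(t)'\eta_3(t), \qquad \lambda(\tau) = \eta_1\\
\zeta_3(t) &= B_1(t)'\lambda(t) + D_{21}(t)'\eta_3(t)\\
\zeta_2(t) &= B_2(t)'\lambda(t) + D_{12}(t)'\eta_2(t)\\
\zeta_1 &= \lambda(t_0).
\end{aligned}$$
From equations (3.23), we choose
$$\eta_1 = X_\infty(\tau)\hat x_\tau(\tau), \qquad \zeta_1 = \gamma^2Q_0\hat x_0, \qquad \eta_2 = \hat z_\tau, \qquad \zeta_3 = \gamma^2\hat w_\tau, \qquad \eta_3 = -\tfrac12\nu$$
and hence the optimizing variables satisfy
$$\begin{aligned}
-\dot\lambda &= A'\lambda + C_1'\hat z_\tau - \tfrac12C_2'\nu, & \dot{\hat x}_\tau &= A\hat x_\tau + B_1\hat w_\tau + B_2u_\tau\\
\gamma^2\hat w_\tau &= B_1'\lambda - \tfrac12D_{21}'\nu, & \hat z_\tau &= C_1\hat x_\tau + D_{12}u_\tau\\
\lambda(\tau) &= X_\infty(\tau)\hat x_\tau(\tau), & \lambda(t_0) &= \gamma^2Q_0\hat x_0.
\end{aligned}$$
The constraint is
$$y_\tau = C_2\hat x_\tau + D_{21}\hat w_\tau.$$
Using our assumption that $D_{21}D_{21}' = I$ and that $D_{21}B_1' = 0$, and combining this with the above equations, the result follows.

Clearly, as it stands Lemma 3.18 is not such a useful result for computation. We would like to construct a differential equation solution for the worst case state. As a first step towards this, we convert the two point boundary value problem into an initial value problem. Let
$$\dot Y_\infty = AY_\infty + Y_\infty A' - Y_\infty(C_2'C_2 - \gamma^{-2}C_1'C_1)Y_\infty + B_1B_1', \qquad Y_\infty(t_0) = Q_0^{-1} \tag{3.24}$$
and we repeat here the control Riccati equation
$$-\dot X_\infty = A'X_\infty + X_\infty A - X_\infty(B_2B_2' - \gamma^{-2}B_1B_1')X_\infty + C_1'C_1, \qquad X_\infty(t_f) = Q_f. \tag{3.25}$$
The following proposition will be very useful in the sequel.

Proposition 3.19. Let $X_\infty$ satisfy equation (3.25). Then $X_\infty(t) \ge 0$ for all $t \in [t_0, t_f]$. If either $Q_f > 0$ or $C_1'C_1 > 0$, then $X_\infty(t) > 0$ for all $t \in [t_0, t_f]$. Similarly, if $Y_\infty$ satisfies (3.24), then $Y_\infty(t) \ge 0$ for all $t \in [t_0, t_f]$, and if either $Q_0 > 0$ or $B_1B_1' > 0$ then $Y_\infty(t) > 0$ for all $t \in [t_0, t_f]$.

Proof. Since $X_\infty(t)$ exists for all $t \in [t_0, t_f]$, there exists a transition matrix $\Phi_X$ for $A - (B_2B_2' - \gamma^{-2}B_1B_1')X_\infty/2$.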
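We can rewrite the Riccati equation as
$$-\dot X_\infty = \Big(A - \tfrac12(B_2B_2' - \gamma^{-2}B_1B_1')X_\infty\Big)'X_\infty + X_\infty\Big(A - \tfrac12(B_2B_2' - \gamma^{-2}B_1B_1')X_\infty\Big) + C_1'C_1$$
and, writing $\Phi_X$ for the transition matrix of $A - \tfrac12(B_2B_2' - \gamma^{-2}B_1B_1')X_\infty$, it is easy to show
$$X_\infty(t) = \Phi_X(t_f, t)'Q_f\Phi_X(t_f, t) + \int_t^{t_f}\Phi_X(s, t)'C_1(s)'C_1(s)\Phi_X(s, t)\,ds.$$
Since $\Phi_X$ is always invertible, the result for $X_\infty$ follows. The result for $Y_\infty$ follows by a similar argument.

Applying Proposition 3.19 to equation (3.24) implies that $Y_\infty > 0$ for all $t \in [t_0, t_f]$, since $Q_0 > 0$, and similarly that $X_\infty > 0$ for all $t \in [t_0, t_f]$ since $Q_f > 0$. Define
$$\bar x(t) := \hat x_\tau(t) - \gamma^{-2}Y_\infty(t)\lambda(t).$$
Differentiation gives
$$\dot{\bar x} = A\bar x + \gamma^{-2}Y_\infty C_1'C_1\bar x + Y_\infty C_2'(y - C_2\bar x) + B_2u$$
and the initial condition is given by
$$\bar x(t_0) = \hat x_\tau(t_0) - \gamma^{-2}Y_\infty(t_0)\lambda(t_0) = \gamma^{-2}Q_0^{-1}\lambda(t_0) - \gamma^{-2}Y_\infty(t_0)\lambda(t_0) = 0$$
since we chose $Y_\infty(t_0) = Q_0^{-1}$. From these equations it is clear that $\bar x(t)$ is independent of $\tau$. Now we can replace the equation in $\lambda$ by the equation in $\bar x$, giving
$$\begin{aligned}
\dot{\hat x}_\tau &= A\hat x_\tau + B_1B_1'Y_\infty^{-1}\hat x_\tau - B_1B_1'Y_\infty^{-1}\bar x + B_2u\\
\dot{\bar x} &= A\bar x + \gamma^{-2}Y_\infty C_1'C_1\bar x + Y_\infty C_2'(y - C_2\bar x) + B_2u, \qquad \bar x(t_0) = 0.
\end{aligned}$$
The boundary condition for $\hat x_\tau$ is now given by
$$\hat x_\tau(\tau) = X_\infty^{-1}(\tau)\lambda(\tau) = \gamma^2X_\infty^{-1}(\tau)Y_\infty^{-1}(\tau)\big(\hat x_\tau(\tau) - \bar x(\tau)\big)$$
hence
$$\hat x_\tau(\tau) = \big(I - \gamma^{-2}Y_\infty(\tau)X_\infty(\tau)\big)^{-1}\bar x(\tau) \tag{3.26}$$
and the worst case disturbance is given by
$$\hat x_0 = \hat x_\tau(t_0) \tag{3.27}$$
$$\hat w_\tau = B_1'Y_\infty^{-1}\hat x_\tau - B_1'Y_\infty^{-1}\bar x + D_{21}'(y - C_2\hat x_\tau). \tag{3.28}$$
We can therefore express the worst case cost function (3.22) by
$$W(\tau; u_\tau, y_\tau) = \int_{t_0}^{\tau}\Big\{\hat x_\tau'C_1'C_1\hat x_\tau + u'u - \gamma^2\hat w_\tau'\hat w_\tau\Big\}\,dt + \hat x_\tau(\tau)'X_\infty(\tau)\hat x_\tau(\tau) - \gamma^2\hat x_0'Q_0\hat x_0. \tag{3.29}$$

We now wish to consider the problem of when the constrained maximization problem defined by equation (3.22) has a finite supremum. This problem is a constrained maximization of a nonhomogeneous quadratic functional with a constraint restricting $w$ to an affine subspace of $L_2$. We will write it using the notation of Lemma 3.18 as follows. First note that $\Gamma_{11} = \Phi(\tau, t_0)$, and is therefore always invertible. Hence, there exists a bijection between the pairs $(x_0, w_\tau)$ and $(x(\tau), w_\tau)$.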
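Therefore, $G(\tau; x_0, u_\tau, w_\tau)$ is bounded with respect to $(x_0, w_\tau)$ if and only if it is bounded with respect to $(x(\tau), w_\tau)$, when $x_0$ is replaced by $\Gamma_{11}^{-1}(x(\tau) - \Gamma_{13}w_\tau - \Gamma_{12}u_\tau)$. We will therefore show that this latter problem is bounded. Define
$$\Lambda = \begin{bmatrix} I & 0\\ \Gamma_{21}\Gamma_{11}^{-1} & \Gamma_{23} - \Gamma_{21}\Gamma_{11}^{-1}\Gamma_{13}\\ 0 & I\\ \Gamma_{11}^{-1} & -\Gamma_{11}^{-1}\Gamma_{13}\end{bmatrix}$$
so that, when $u_\tau = 0$,
$$\begin{pmatrix} x(\tau)\\ z_\tau\\ w_\tau\\ x_0\end{pmatrix} = \Lambda\begin{pmatrix} x(\tau)\\ w_\tau\end{pmatrix}.$$
Then let
$$\Sigma = \begin{bmatrix} X_\infty(\tau) & 0 & 0 & 0\\ 0 & I & 0 & 0\\ 0 & 0 & -\gamma^2I & 0\\ 0 & 0 & 0 & -\gamma^2Q_0\end{bmatrix}$$
and we will write
$$v = \begin{pmatrix} x(\tau)\\ w_\tau\end{pmatrix}$$
where $v$ is in the product space $\mathbb{R}^n \times L_2$. It is then easy to see that
$$G(\tau; x_0, u_\tau, w_\tau) = \langle v, \Lambda^*\Sigma\Lambda v\rangle + \langle q, v\rangle + c$$
where
$$q = 2\Lambda^*\Sigma\begin{bmatrix} 0\\ \Gamma_{22} - \Gamma_{21}\Gamma_{11}^{-1}\Gamma_{12}\\ 0\\ -\Gamma_{11}^{-1}\Gamma_{12}\end{bmatrix}u_\tau$$
and $c$ is a constant depending on $u_\tau$ only. We can also write the constraint that $w_\tau$ and $x_0$ be consistent with past measurements of $y$ as $\Xi v = e$, where
$$\Xi = \begin{bmatrix}\Gamma_{31}\Gamma_{11}^{-1} & \Gamma_{33} - \Gamma_{31}\Gamma_{11}^{-1}\Gamma_{13}\end{bmatrix}, \qquad e = y_\tau + \Gamma_{31}\Gamma_{11}^{-1}\Gamma_{12}u_\tau - \Gamma_{32}u_\tau.$$
The maximization problem (3.22) can then be written
$$W(\tau; u_\tau, y_\tau) = \sup_{v;\ \Xi v = e}\ \langle v, \Lambda^*\Sigma\Lambda v\rangle + \langle q, v\rangle + c.$$

Lemma 3.20. Suppose there exists $\varepsilon > 0$ such that $\langle v, \Lambda^*\Sigma\Lambda v\rangle \le -\varepsilon\|v\|^2$ for all $v$ such that $\Xi v = 0$. Then $W(\tau; u_\tau, y_\tau)$ is finite for all $u_\tau$ and $y_\tau$.

Proof. Write $M = \Lambda^*\Sigma\Lambda$ for brevity. We assume that there exists $v_2 \in \mathbb{R}^n \times L_2$ such that $\Xi v_2 = e$. Then $\Xi v = e$ if and only if there exists $v_1 \in \ker\Xi$ such that $v = v_1 + v_2$. For any such $v$,
$$\begin{aligned}
\langle v, Mv\rangle + \langle q, v\rangle + c &= \langle v_1, Mv_1\rangle + \langle q, v_1\rangle + \langle v_2, Mv_2\rangle + 2\langle Mv_2, v_1\rangle + \langle q, v_2\rangle + c\\
&= \langle v_1, Mv_1\rangle + \langle q + 2Mv_2, v_1\rangle + d
\end{aligned}$$
where $d$ depends on $e$ and $u_\tau$ only. Hence if
$$\sup_{v_1 \in \ker\Xi}\ \langle v_1, Mv_1\rangle + \langle q + 2Mv_2, v_1\rangle$$
is finite, then so is $W$. Using the Cauchy-Schwarz inequality, we have
$$\langle v_1, Mv_1\rangle + \langle q + 2Mv_2, v_1\rangle \le \langle v_1, Mv_1\rangle + \|q + 2Mv_2\|\,\|v_1\|$$
and, writing $v_1 = \alpha v_0$, where $\alpha \ge 0$ and $\|v_0\| = 1$,
$$\langle v_1, Mv_1\rangle + \langle q + 2Mv_2, v_1\rangle \le \alpha^2\langle v_0, Mv_0\rangle + \alpha\|q + 2Mv_2\| \le \sup_{v_0 \in \ker\Xi,\ \|v_0\| = 1}\frac{\|q + 2Mv_2\|^2}{-4\langle v_0, Mv_0\rangle}$$
since the previous expression is a quadratic in $\alpha$. Therefore, if $\langle v_0, Mv_0\rangle \le -\varepsilon\|v_0\|^2$ for all $v_0 \in \ker\Xi$, then
$$\langle v_1, Mv_1\rangle + \langle q + 2Mv_2, v_1\rangle \le \frac{1}{4\varepsilon}\|q + 2Mv_2\|^2$$
for all $v_1 \in \ker\Xi$, and hence $\langle v, Mv\rangle + \langle q, v\rangle + c$ is bounded for all $v$ satisfying $\Xi v = e$.

We can now make use of this Lemma. If we set $y_\tau = 0$ and $u_\tau = 0$, then $e = 0$ and $q = 0$.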
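Hence, according to the Lemma, if there exists $\varepsilon > 0$ such that
$$G(\tau; x_0, 0, w_\tau) < -\varepsilon\big(|x(\tau)|^2 + \|w_\tau\|_2^2\big)$$
with $x_0 = \Gamma_{11}^{-1}(x(\tau) - \Gamma_{13}w_\tau)$, for all $x_0$ and $w_\tau$ which satisfy the homogeneous constraint (that is, which are consistent with $y_\tau = 0$ and $u_\tau = 0$), then $W(\tau; u_\tau, y_\tau)$ is finite for all $y_\tau$ and $u_\tau$. This leads to the following result.

Lemma 3.21. Suppose there exists a matrix function $Y_\infty$ satisfying the Riccati differential equation (3.24) on $[t_0, \tau]$, and a matrix function $X_\infty$ satisfying the Riccati differential equation (3.25) on $[\tau, t_f]$, such that $X_\infty(\tau) - \gamma^2Y_\infty^{-1}(\tau) < 0$. Then there exists a unique maximum to the constrained maximization problem defined by equation (3.22).

Proof. Applying Lemma 3.20, we need only consider the case when $u_\tau = 0$ and $y_\tau = 0$. Adding and subtracting $\gamma^2x'Y_\infty^{-1}x$ to the cost function, and using the Riccati equation (3.24), gives
$$\begin{aligned}
G(\tau; x_0, u_\tau, w_\tau) &= \int_{t_0}^{\tau}\Big\{\gamma^2\tfrac{d}{dt}\big(x'Y_\infty^{-1}x\big) + z'z - \gamma^2w'w\Big\}\,dt + x(\tau)'\big(X_\infty(\tau) - \gamma^2Y_\infty^{-1}(\tau)\big)x(\tau)\\
&= \int_{t_0}^{\tau}\Big\{u'u - \gamma^2w'w + 2\gamma^2w'B_1'Y_\infty^{-1}x - \gamma^2x'Y_\infty^{-1}B_1B_1'Y_\infty^{-1}x + 2\gamma^2u'B_2'Y_\infty^{-1}x + \gamma^2x'C_2'C_2x\Big\}\,dt\\
&\qquad + x(\tau)'\big(X_\infty(\tau) - \gamma^2Y_\infty^{-1}(\tau)\big)x(\tau).
\end{aligned}$$
Since $y = 0$, for any $t \in [t_0, \tau]$ we know that $0 = C_2(t)x(t) + D_{21}w(t)$. Then we can parametrize all $w(t)$ satisfying this equation by
$$w(t) = -D_{21}'C_2x(t) + D_{21}^{\perp\prime}r(t) \qquad \forall\, r(t) \in \mathbb{R}^{m_3}$$
where $m_3 < m_1$ if $D_{21} \ne I$, and the set of $v$ satisfying the homogeneous constraint is parametrized by $x_0$ and $r \in L_2$. Hence, setting $u = 0$ as well, we can write the cost function as
$$\begin{aligned}
G(\tau; x_0, 0, w_\tau) &= \int_{t_0}^{\tau} -\gamma^2\big|D_{21}^{\perp\prime}r(t) - B_1'Y_\infty^{-1}x(t)\big|^2\,dt + x(\tau)'\big(X_\infty(\tau) - \gamma^2Y_\infty^{-1}(\tau)\big)x(\tau)\\
&= \int_{t_0}^{\tau} -\gamma^2\big|r(t) - D_{21}^{\perp}B_1'Y_\infty^{-1}x(t)\big|^2\,dt + x(\tau)'\big(X_\infty(\tau) - \gamma^2Y_\infty^{-1}(\tau)\big)x(\tau)\\
&= -\gamma^2\|Lr\|_2^2 + x(\tau)'\big(X_\infty(\tau) - \gamma^2Y_\infty^{-1}(\tau)\big)x(\tau)
\end{aligned}$$
where $L$ is given by
$$L = \left[\begin{array}{c|c} A & B_1D_{21}^{\perp\prime}\\ \hline -D_{21}^{\perp}B_1'Y_\infty^{-1} & I\end{array}\right]$$
and its inverse has realization
$$L^{-1} = \left[\begin{array}{c|c} A + B_1D_{21}^{\perp\prime}D_{21}^{\perp}B_1'Y_\infty^{-1} & B_1D_{21}^{\perp\prime}\\ \hline D_{21}^{\perp}B_1'Y_\infty^{-1} & I\end{array}\right].$$
Since any finite dimensional linear time varying operator on a finite horizon is bounded, we know that $\|r\| \le \|L^{-1}\|\,\|Lr\|$, and hence
$$G(\tau; x_0, 0, w_\tau) \le -\gamma^2\frac{1}{\|L^{-1}\|^2}\|r\|^2 + x(\tau)'\big(X_\infty(\tau) - \gamma^2Y_\infty^{-1}(\tau)\big)x(\tau).$$
Since $X_\infty(\tau) - \gamma^2Y_\infty^{-1}(\tau) < 0$, and is finite dimensional, it is automatically less than $-\varepsilon I$ for some $\varepsilon > 0$. The result then follows on applying the previous Lemma.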
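The hypothesis of Lemma 3.21 can be tested numerically. The following is a minimal sketch only, assuming constant system matrices and a fixed-step fourth-order Runge-Kutta integrator (both simplifications made for illustration); the function names `rk4` and `coupling_holds` are hypothetical and do not come from the text. It integrates (3.24) forwards from $t_0$ to $\tau$, integrates (3.25) backwards from $t_f$ to $\tau$, and checks whether $X_\infty(\tau) - \gamma^2Y_\infty^{-1}(\tau)$ is negative definite. It does not detect finite escape of either Riccati solution.

```python
import numpy as np

def rk4(f, x0, t0, t1, steps):
    """Fixed-step RK4 for dX/dt = f(X) from t0 to t1 (t1 < t0 integrates backwards)."""
    h = (t1 - t0) / steps
    x = x0.copy()
    for _ in range(steps):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x = x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return x

def coupling_holds(A, B1, B2, C1, C2, Q0, Qf, gamma, t0, tau, tf, steps=2000):
    """Return (holds, X, Y) where holds tests X(tau) - gamma^2 Y(tau)^-1 < 0.

    Assumes constant matrices and that both Riccati solutions exist on
    their intervals (an assumption of this sketch, not checked here).
    """
    g2 = gamma ** 2
    # (3.24): Ydot = A Y + Y A' - Y (C2'C2 - gamma^-2 C1'C1) Y + B1 B1'
    ydot = lambda Y: (A @ Y + Y @ A.T
                      - Y @ (C2.T @ C2 - C1.T @ C1 / g2) @ Y + B1 @ B1.T)
    # (3.25): -Xdot = A'X + X A - X (B2 B2' - gamma^-2 B1 B1') X + C1'C1
    xdot = lambda X: -(A.T @ X + X @ A
                       - X @ (B2 @ B2.T - B1 @ B1.T / g2) @ X + C1.T @ C1)
    Y = rk4(ydot, np.linalg.inv(Q0), t0, tau, steps)   # Y(t0) = Q0^-1
    X = rk4(xdot, Qf, tf, tau, steps)                  # X(tf) = Qf
    M = X - g2 * np.linalg.inv(Y)
    M = 0.5 * (M + M.T)                                # symmetrize before eigvalsh
    return bool(np.max(np.linalg.eigvalsh(M)) < 0.0), X, Y
```

For a scalar plant with all matrices equal to one except $A = -1$, calling `coupling_holds` with a large `gamma` returns `True` in the first slot, while a sufficiently small `gamma` makes the test fail, which is the expected qualitative behaviour of the coupling condition.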
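A direct proof of certainty equivalence.

In the following Lemma we give an explicit construction of the rate of change of the worst case cost. This will allow us to immediately show that use of the certainty equivalence controller guarantees that $W$ is decreasing with $\tau$, and hence satisfies the assumptions of Lemma 3.17.

Lemma 3.22. If $X_\infty(\tau) - \gamma^2Y_\infty^{-1}(\tau) < 0$, then the unique maximum $W(\tau; u_\tau, y_\tau)$ for the problem described by equation (3.22) satisfies
$$\frac{d}{d\tau}W(\tau; u_\tau, y_\tau) = -\gamma^2\big|y(\tau) - C_2(\tau)\check x(\tau)\big|^2 + \big|u(\tau) + B_2(\tau)'X_\infty(\tau)\check x(\tau)\big|^2$$
where $\check x(\tau) = \hat x_\tau(\tau)$. Hence, if $u(\tau) = -B_2(\tau)'X_\infty(\tau)\check x(\tau)$ for all $\tau \in [t_0, t_f]$, then
$$\frac{d}{d\tau}W(\tau; u_\tau, y_\tau) \le 0$$
for all $y_\tau$. Further, $\check x$ is the solution of
$$\dot{\check x} = A\check x + \gamma^{-2}B_1B_1'X_\infty\check x + Z_\infty C_2'(y - C_2\check x) + \gamma^{-2}Z_\infty X_\infty B_2B_2'X_\infty\check x + Z_\infty Y_\infty^{-1}B_2u$$
with initial condition $\check x(t_0) = 0$.

Proof. We present here a direct proof of this. Substituting into equation (3.29) from equation (3.28) gives
$$W(\tau; u_\tau, y_\tau) = \int_{t_0}^{\tau}\Big\{\hat x_\tau'C_1'C_1\hat x_\tau + u'u - \gamma^2\big|B_1'Y_\infty^{-1}(\bar x - \hat x_\tau)\big|^2 - \gamma^2\big|y - C_2\hat x_\tau\big|^2\Big\}\,dt + \hat x_\tau(\tau)'X_\infty(\tau)\hat x_\tau(\tau) - \gamma^2\hat x_0'Q_0\hat x_0.$$
We now add and subtract $\gamma^2\hat x_\tau'Y_\infty^{-1}\hat x_\tau$ to give
$$\begin{aligned}
W(\tau; u_\tau, y_\tau) ={}& \int_{t_0}^{\tau}\Big\{\gamma^2\tfrac{d}{dt}\big(\hat x_\tau'Y_\infty^{-1}\hat x_\tau\big) + \hat x_\tau'C_1'C_1\hat x_\tau + u'u - \gamma^2\big|B_1'Y_\infty^{-1}(\bar x - \hat x_\tau)\big|^2 - \gamma^2\big|y - C_2\hat x_\tau\big|^2\Big\}\,dt\\
&+ \hat x_\tau(\tau)'\big(X_\infty(\tau) - \gamma^2Y_\infty^{-1}(\tau)\big)\hat x_\tau(\tau)
\end{aligned}$$
and after some manipulation,
$$\begin{aligned}
W(\tau; u_\tau, y_\tau) ={}& \int_{t_0}^{\tau}\Big\{u'u - \gamma^2y'y + 2\gamma^2u'B_2'Y_\infty^{-1}\hat x_\tau - \gamma^2\bar x'Y_\infty^{-1}B_1B_1'Y_\infty^{-1}\bar x + 2\gamma^2y'C_2\hat x_\tau\Big\}\,dt\\
&+ \hat x_\tau(\tau)'\big(X_\infty(\tau) - \gamma^2Y_\infty^{-1}(\tau)\big)\hat x_\tau(\tau).
\end{aligned}$$
Now note that
$$\frac{d}{dt}\big(\hat x_\tau'Y_\infty^{-1}\bar x\big) = u'B_2'Y_\infty^{-1}\bar x - \bar x'Y_\infty^{-1}B_1B_1'Y_\infty^{-1}\bar x + y'C_2\hat x_\tau + \hat x_\tau'Y_\infty^{-1}B_2u.$$
Therefore, since $\bar x(t_0) = 0$,
$$\begin{aligned}
W(\tau; u_\tau, y_\tau) ={}& \int_{t_0}^{\tau}\Big\{-2\gamma^2\tfrac{d}{dt}\big(\hat x_\tau'Y_\infty^{-1}\bar x\big) + u'u - \gamma^2y'y + 2\gamma^2u'B_2'Y_\infty^{-1}\hat x_\tau - \gamma^2\bar x'Y_\infty^{-1}B_1B_1'Y_\infty^{-1}\bar x + 2\gamma^2y'C_2\hat x_\tau\Big\}\,dt\\
&+ \hat x_\tau(\tau)'\big(X_\infty(\tau) - \gamma^2Y_\infty^{-1}(\tau)\big)\hat x_\tau(\tau) + 2\gamma^2\hat x_\tau(\tau)'Y_\infty^{-1}(\tau)\bar x(\tau).
\end{aligned}$$
From equation (3.26), this becomes
$$\begin{aligned}
W(\tau; u_\tau, y_\tau) ={}& \int_{t_0}^{\tau}\Big\{-2\gamma^2\tfrac{d}{dt}\big(\hat x_\tau'Y_\infty^{-1}\bar x\big) + u'u - \gamma^2y'y + 2\gamma^2u'B_2'Y_\infty^{-1}\hat x_\tau - \gamma^2\bar x'Y_\infty^{-1}B_1B_1'Y_\infty^{-1}\bar x + 2\gamma^2y'C_2\hat x_\tau\Big\}\,dt\\
&- \hat x_\tau(\tau)'\big(X_\infty(\tau) - \gamma^2Y_\infty^{-1}(\tau)\big)\hat x_\tau(\tau)
\end{aligned}$$
and hence
$$W(\tau; u_\tau, y_\tau) = \int_{t_0}^{\tau}\Big\{u'u - \gamma^2y'y + \gamma^2\bar x'Y_\infty^{-1}B_1B_1'Y_\infty^{-1}\bar x - 2\gamma^2u'B_2'Y_\infty^{-1}\bar x\Big\}\,dt - \hat x_\tau(\tau)'\big(X_\infty(\tau) - \gamma^2Y_\infty^{-1}(\tau)\big)\hat x_\tau(\tau).$$
Note that we had to make two separate substitutions here since equation (3.26) only holds at time $t = \tau$. The advantage of this form for the worst case cost function is that all the terms within the integral are independent of $\tau$, except at the endpoint of the interval. Hence
$$\begin{aligned}
\frac{d}{d\tau}W(\tau; u_\tau, y_\tau) ={}& \frac{d}{d\tau}\Big\{\hat x_\tau(\tau)'\big(\gamma^2Y_\infty^{-1}(\tau) - X_\infty(\tau)\big)\hat x_\tau(\tau)\Big\} + u(\tau)'u(\tau) - \gamma^2y(\tau)'y(\tau)\\
&+ \gamma^2\bar x(\tau)'Y_\infty^{-1}(\tau)B_1(\tau)B_1(\tau)'Y_\infty^{-1}(\tau)\bar x(\tau) - 2\gamma^2u(\tau)'B_2(\tau)'Y_\infty^{-1}(\tau)\bar x(\tau).
\end{aligned}$$
Define the matrix $Z_\infty(t)$ by
$$Z_\infty(t) = Y_\infty(t)\big(I - \gamma^{-2}X_\infty(t)Y_\infty(t)\big)^{-1}. \tag{3.30}$$
Note that, by the assumptions stated in the Lemma, $Z_\infty(t)$ is well defined for $t$ in some neighbourhood of $\tau$, so $Z_\infty(\tau)$ satisfies
$$\dot Z_\infty = (A + \gamma^{-2}B_1B_1'X_\infty)Z_\infty + Z_\infty(A + \gamma^{-2}B_1B_1'X_\infty)' - Z_\infty(C_2'C_2 - \gamma^{-2}X_\infty B_2B_2'X_\infty)Z_\infty + B_1B_1'.$$
Now let $\check x(\tau) = \hat x_\tau(\tau)$ for all $\tau$. Then
$$\check x = (I - \gamma^{-2}Y_\infty X_\infty)^{-1}\bar x = Z_\infty Y_\infty^{-1}\bar x.$$
Again note that $\check x$ is independent of $\tau$. Therefore
$$\dot{\check x} = A\check x + \gamma^{-2}B_1B_1'X_\infty\check x + Z_\infty C_2'(y - C_2\check x) + \gamma^{-2}Z_\infty X_\infty B_2B_2'X_\infty\check x + Z_\infty Y_\infty^{-1}B_2u$$
and
$$\begin{aligned}
\frac{d}{dt}\,\check x'(\gamma^2Y_\infty^{-1} - X_\infty)\check x ={}& 2\check x'(\gamma^2Y_\infty^{-1} - X_\infty)\gamma^{-2}B_1B_1'X_\infty\check x + 2\gamma^2\check x'C_2'(y - C_2\check x) + 2\check x'X_\infty B_2B_2'X_\infty\check x\\
&+ 2\gamma^2\check x'Y_\infty^{-1}B_2u - \check x'X_\infty B_2B_2'X_\infty\check x + \gamma^{-2}\check x'X_\infty B_1B_1'X_\infty\check x\\
&+ \gamma^2\check x'C_2'C_2\check x - \gamma^2\check x'Y_\infty^{-1}B_1B_1'Y_\infty^{-1}\check x.
\end{aligned}$$
Hence,
$$\frac{d}{d\tau}W(\tau; u_\tau, y_\tau) = -\gamma^2\big|y(\tau) - C_2(\tau)\check x(\tau)\big|^2 + u(\tau)'u(\tau) + 2u(\tau)'B_2(\tau)'X_\infty(\tau)\check x(\tau) + \check x(\tau)'X_\infty(\tau)B_2(\tau)B_2(\tau)'X_\infty(\tau)\check x(\tau)$$
which gives
$$\frac{d}{d\tau}W(\tau; u_\tau, y_\tau) = -\gamma^2\big|y(\tau) - C_2(\tau)\check x(\tau)\big|^2 + \big|u(\tau) + B_2(\tau)'X_\infty(\tau)\check x(\tau)\big|^2.$$

Corollary 3.23. Suppose that, for all $t \in [t_0, t_f]$, there exist matrices $X_\infty(t)$ and $Y_\infty(t)$ satisfying equations (3.25) and (3.24), and that for all $t \in [t_0, t_f]$ these solutions satisfy $X_\infty(t) - \gamma^2Y_\infty^{-1}(t) < 0$. Then define the feedback law $\hat\mu$ by
$$u(t) = -B_2(t)'X_\infty(t)\check x(t).$$
That is, it is the state feedback law with the worst case state estimate. Then
$$\sup_{x_0 \in \mathbb{R}^n,\ w \in L_2} J(x_0, \hat\mu, w) = 0.$$

Proof. This follows directly from Theorem 3.17 and the previous Lemma.

Note that, since $Y_\infty(t) > 0$, the condition that $X_\infty(t) - \gamma^2Y_\infty^{-1}(t) < 0$ for all $t \in [t_0, t_f]$ is equivalent to the spectral radius condition $\rho\big(X_\infty(t)Y_\infty(t)\big) < \gamma^2$.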
This is straightforward from
$$X_\infty - \gamma^2Y_\infty^{-1} < 0 \iff Y_\infty^{1/2}X_\infty Y_\infty^{1/2} - \gamma^2I < 0 \iff \rho\big(Y_\infty^{1/2}X_\infty Y_\infty^{1/2}\big) < \gamma^2 \iff \rho(X_\infty Y_\infty) < \gamma^2.$$
We have now presented a direct derivation of the certainty equivalence principle for continuous time linear time varying systems on a finite horizon. We will make use of this in the sequel in constructing moving horizon $H_\infty$ control laws for linear time varying systems on an infinite horizon.

3.6 Discrete time games with imperfect measurement

3.6.1 The information state approach

We consider finite dimensional nonlinear time varying systems on $[k_0, k_f]$ described by
$$\begin{aligned}
x(k+1) &= f(k, x(k), u(k), w(k)), \qquad x(k_0) = x_0\\
y(k) &= h(k, x(k), w(k)).
\end{aligned}$$
Here the state $x(k) \in \mathbb{R}^n$, the disturbance $w \in \ell_2^{m_1}[k_0, k_f]$, and the control input $u \in \ell_2^{m_2}[k_0, k_f]$. Let the cost function be defined by
$$J(x_0, \mu, w) := \alpha(x_0) + \sum_{i=k_0}^{k_f} g\big(i, x(i), u(i), w(i)\big) + \beta\big(x(k_f+1)\big), \tag{3.31}$$
where the final state weight is at time $k_f + 1$ so that the final inputs $u(k_f)$ and $w(k_f)$ are included in the cost function, and so that, if used in a moving horizon problem, the closed loop tends to be stable. We will now give a review of the work done by James, Baras and Elliott [40] on the information state approach to constructing solutions to the problem
$$\inf_{\mu \in \mathcal{U}_{of}}\ \sup_{w \in \ell_2,\ x_0 \in \mathbb{R}^n} J(x_0, \mu, w)$$
where the set of strictly causal discrete time output feedback strategies, $\mathcal{U}_{of}$, is defined by
$$\mathcal{U}_{of}^{p,m}[k_0, k_1] = \Big\{\mu : \ell_2^p[k_0, k_1] \to \ell_2^m[k_0, k_1]\ ;\ \Pi_{[k_0,k]}\,\mu\,\Pi_{[k_0,k-1]} = \Pi_{[k_0,k]}\,\mu \text{ for all } k \in [k_0, k_1]\Big\},$$
with $\Pi_{[a,b]}$ denoting truncation to the interval $[a, b]$.

Definition 3.24. Define the Value of the game for $k_0 \le j \le k_f$ by
$$V\big(j, x(j)\big) := \inf_{\mu \in \mathcal{U}_{sf}[j,k_f]}\ \sup_{w \in \ell_2[j,k_f]}\Big\{\sum_{i=j}^{k_f} g\big(i, x(i), u(i), w(i)\big) + \beta\big(x(k_f+1)\big)\Big\}. \tag{3.32}$$
The Value function is that used in the standard technique of dynamic programming, hence we do not add $\alpha(x_0)$ when $j = k_0$. The following Lemma gives the dynamic programming result.

Lemma 3.25.
The value function $V$ satisfies the dynamic programming recursion
$$V\big(j, x(j)\big) = \inf_{u(j) \in \mathbb{R}^{m_2}}\ \sup_{w(j) \in \mathbb{R}^{m_1}}\Big\{g\big(j, x(j), u(j), w(j)\big) + V\big(j+1, x(j+1)\big)\Big\} \tag{3.33}$$
with boundary condition
$$V\big(k_f+1, x(k_f+1)\big) = \beta\big(x(k_f+1)\big). \tag{3.34}$$

Proof. This is straightforward by induction.

Again, assume that at time $j$, we have measurements of $y$ and $u$ on $[k_0, j-1]$, which we will denote by $y_{j-1} \in \ell_2[k_0, j-1]$ and $u_{j-1} \in \ell_2[k_0, j-1]$. We now define the following disturbance class, which is the set of all possible $(w, x_0)$ disturbances consistent with $y_j$ and $u_j$,
$$\Omega_0(j) := \Big\{(x_0, w)\ ;\ x_0 \in \mathbb{R}^n,\ w \in \ell_2[k_0, k_f] \text{ such that } x(i+1) = Ax(i) + B_1w(i) + B_2u_j(i),\ \ y_j(i) = C_2x(i) + D_{21}w(i) \text{ for } k_0 \le i \le j\Big\}$$
and the following subset which is further consistent with the hypothesis that $x(j+1) = \xi$,
$$\Omega(j, \xi) := \Big\{(x_0, w)\ ;\ x_0 \in \mathbb{R}^n,\ w \in \ell_2[k_0, k_f] \text{ such that, for } k_0 \le i \le j,\ x(i+1) = Ax(i) + B_1w(i) + B_2u_j(i),\ \ y_j(i) = C_2x(i) + D_{21}w(i), \text{ and } x(j+1) = \xi\Big\}.$$
Note that these disturbance sets actually depend on past values of $u$ and $y$, although we omit them to reduce notation.

Definition 3.26. Define the information state $P\big(j, x(j)\big)$ by
$$P\big(j, x(j)\big) := \sup_{(w, x_0) \in \Omega(j-1, x(j))}\Big\{\alpha(x_0) + \sum_{i=k_0}^{j-1} g\big(i, x(i), u(i), w(i)\big)\Big\}. \tag{3.35}$$
Note that $P\big(j, x(j)\big)$ is actually a function of $y_{j-1}$ and $u_{j-1}$ also, but again we omit them from the arguments to reduce notation. We use the convention that the supremum over an empty set equals $-\infty$. It is easy to show by induction the following result.

Lemma 3.27. The information state satisfies the dynamic programming recursion
$$P\big(j+1, x(j+1)\big) = \sup_{\substack{x(j),\, w(j)\\ x(j+1) = f(j, x(j), u_j(j), w(j))\\ y_j(j) = h(j, x(j), w(j))}}\Big\{P\big(j, x(j)\big) + g\big(j, x(j), u(j), w(j)\big)\Big\} \tag{3.36}$$
with the boundary condition $P(k_0, x_0) = \alpha(x_0)$.

Write $p(j) = P(j, \cdot)$. That is, $p(j) : \mathbb{R}^n \to \mathbb{R}$. Then the above dynamic programming recursion can be written as
$$p(j+1) = F\big(p(j), u(j), y(j)\big)$$
and this describes the evolution of the infinite dimensional `state' $p$. This new infinite dimensional system has `inputs' $u$ and $y$.
We shall show that $p$ can be regarded as the state for a new dynamic game, in which case equation (3.36) is the evolution equation for this system. Define the class $\mathcal{U}_p[i, j]$ of strategies for the minimizing player which feed back the information state as the set of controllers of the form $u(j) = \mu\big(p(i), \ldots, p(j)\big)$. Note that here $\mu$ is a functional. Each $p(j)$ could be written as $p(j) = P(j, \cdot\,; u_{j-1}, y_{j-1})$, dependent on the past inputs and measurements of the system. Therefore a controller in $\mathcal{U}_p[k_0, j]$ is also a member of $\mathcal{U}_{of}$.

We now define a cost function for the new game with state $p$. This cost function, unlike $J$ in equation (3.31), will only have a final state weight; the corresponding functions to $g$ and $\alpha$ will be zero. The cost function is
$$\mathcal{J}\big(p(k_0), \mu, y\big) = B\big(p(k_f+1)\big)$$
where the final weight on the information state is
$$B\big(p(k_f+1)\big) = \sup_{x(k_f+1)}\Big\{P\big(k_f+1, x(k_f+1)\big) + \beta\big(x(k_f+1)\big)\Big\}.$$
We have written $B$ here to indicate the correspondence to the final state weight $\beta$ for the original finite dimensional problem. We will define the new game with the infinite dimensional state $p$ as
$$\inf_{\mu \in \mathcal{U}_p[k_0,k_f]}\ \sup_{y \in \ell_2[k_0,k_f]} \mathcal{J}\big(p(k_0), \mu, y\big).$$
Then, in exactly the same way as in Definition 3.24, we can define a Value function for the new game with this cost function. This new Value function is simply
$$\mathcal{V}\big(j, p(j)\big) := \inf_{\mu \in \mathcal{U}_p[j,k_f]}\ \sup_{y \in \ell_2[j,k_f]} B\big(p(k_f+1)\big) \tag{3.37}$$
since there are no terms in $p(i)$ for $k_0 \le i \le k_f$ in this new cost function. Since there is only a final term, the corresponding equation to (3.33) is very simple. It is
$$\mathcal{V}\big(j, p(j)\big) = \inf_{u(j) \in \mathbb{R}^{m_2}}\ \sup_{y(j) \in \mathbb{R}^{p_2}} \mathcal{V}\big(j+1, p(j+1)\big) \tag{3.38}$$
and again this is straightforward to show by induction. Clearly, $\mathcal{V}\big(j, p(j)\big)$ is actually a function of $y_{j-1}$ and $u_{j-1}$ also. We then have the following useful Lemma.

Lemma 3.28. The following equality holds:
$$\sup_{y \in \ell_2[j,k_f]} \mathcal{J}\big(p(k_0), \mu, y\big) = \sup_{(x_0, w) \in \Omega_0(j)} J(x_0, \mu, w).$$

Proof.
The left hand side is equal to
$$\sup_{y \in \ell_2[j,k_f]}\ \sup_{x(k_f+1)}\Big[\sup_{(w, x_0) \in \Omega(k_f, x(k_f+1))}\Big\{\alpha(x_0) + \sum_{i=k_0}^{k_f} g\big(i, x(i), u(i), w(i)\big)\Big\} + \beta\big(x(k_f+1)\big)\Big]$$
and the result follows, since the maximizations with respect to $w$ and $x_0$ are performed subject to constraints on $w$ defined by the values of $y_{[j,k_f]}$, and subject to $x(k_f+1)$ being fixed. But for every given $w$ and $x_0$, there exists a corresponding $y$ and $x(k_f+1)$, and we are maximizing with respect to $y_{[j,k_f]}$ and $x(k_f+1)$.

Corollary 3.29.
$$\sup_{y \in \ell_2[k_0,k_f]} \mathcal{J}\big(p(k_0), \mu, y\big) = \sup_{w \in \ell_2[k_0,k_f]}\ \sup_{x_0 \in \mathbb{R}^n} J(x_0, \mu, w) \tag{3.39}$$

Corollary 3.30. The function $\mathcal{V}\big(k_0, p(k_0)\big)$ satisfies
$$\mathcal{V}\big(k_0, p(k_0)\big) = \inf_{\mu \in \mathcal{U}_{of}}\ \sup_{w \in \ell_2[k_0,k_f]}\ \sup_{x_0 \in \mathbb{R}^n} J(x_0, \mu, w).$$

Proof. The above statement follows from Corollary 3.29 when $\mathcal{U}_{of}$ is replaced by $\mathcal{U}_p$. However, since $\mathcal{V}$ satisfies the dynamic programming recursion (3.38), the optimal $\mu$ minimizing the left hand side of equation (3.39) must be given by $\mu \in \mathcal{U}_p$, and hence the optimal $\mu$ for the original problem is also given by $\mu \in \mathcal{U}_p$.

Since we know that $\mathcal{V}$ satisfies the dynamic programming equation (3.38), if we can solve this recursion then we can find $u(j)$ at each $j \in [k_0, k_f]$, as a function of $p(j)$. That is, the resulting control law will be a feedback law dependent on the information state. This result is quite general. It holds for a wide class of nonlinear systems, and in principle gives us a method for recursively generating the optimal solution to the problem of minimizing $J$. The problem is that in general $p(j)$ is described by an infinite dimensional evolution equation. Then the recursion for $\mathcal{V}$ is a difference equation defined on a space of functions, which appears extremely difficult to solve at present. However, Başar and Bernhard [7] give a separation principle for this problem, which replaces the infinite dimensional recursion (3.38) for $\mathcal{V}$ by the finite dimensional recursion for $V$, the value of the game. Bernhardsson [12] gives a counterexample to this principle in the general nonlinear case.
Following James [37], we shall now derive this separation principle from the information state formulation, and see that a particular saddle point condition is necessary for it to hold.

Theorem 3.31. Suppose that, for all $j \in [k_0, k_f]$,
$$\mathcal{V}\big(j, p(j)\big) = \sup_{x(j) \in \mathbb{R}^n}\Big\{P\big(j, x(j)\big) + V\big(j, x(j)\big)\Big\}. \tag{3.40}$$
Then let
$$\check x(j) \in \arg\max_{x(j) \in \mathbb{R}^n}\Big\{P\big(j, x(j)\big) + V\big(j, x(j)\big)\Big\};$$
then $u(j) = \mu_0\big(j, \check x(j)\big)$ is an optimal control law for the dynamic game problem defined above, where $\mu_0$ defined by equation (3.7) is the optimal state feedback control law. Further, for each $j \in [k_0, k_f]$, condition (3.40) holds if and only if
$$\begin{aligned}
&\inf_{u(j) \in \mathbb{R}^{m_2}}\ \sup_{x(j) \in \mathbb{R}^n}\ \sup_{w(j) \in \mathbb{R}^{m_1}}\Big\{P\big(j, x(j)\big) + g\big(j, x(j), u(j), w(j)\big) + V\big(j+1, x(j+1)\big)\Big\}\\
&\quad = \sup_{x(j) \in \mathbb{R}^n}\ \inf_{u(j) \in \mathbb{R}^{m_2}}\ \sup_{w(j) \in \mathbb{R}^{m_1}}\Big\{P\big(j, x(j)\big) + g\big(j, x(j), u(j), w(j)\big) + V\big(j+1, x(j+1)\big)\Big\}.
\end{aligned} \tag{3.41}$$

Proof. Let
$$\begin{aligned}
\Omega_x\big(x(j+1), u_j(j), w(j)\big) &= \Big\{x(j)\ ;\ x(j+1) = f\big(j, x(j), u_j(j), w(j)\big)\Big\}\\
\Omega_w\big(x(j), y_j(j)\big) &= \Big\{w(j)\ ;\ y_j(j) = h\big(j, x(j), w(j)\big)\Big\}.
\end{aligned}$$
Suppose condition (3.40) holds at time $j + 1$.
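Then
$$\begin{aligned}
\mathcal{V}\big(j+1, p(j+1)\big) &= \sup_{x(j+1) \in \mathbb{R}^n}\Big\{P\big(j+1, x(j+1)\big) + V\big(j+1, x(j+1)\big)\Big\}\\
&= \sup_{x(j+1) \in \mathbb{R}^n}\Big\{\sup_{x(j) \in \Omega_x}\ \sup_{w(j) \in \Omega_w}\Big[P\big(j, x(j)\big) + g\big(j, x(j), u(j), w(j)\big)\Big] + V\big(j+1, x(j+1)\big)\Big\}\\
&= \sup_{x(j+1) \in \mathbb{R}^n}\ \sup_{x(j) \in \Omega_x}\ \sup_{w(j) \in \Omega_w}\Big\{P\big(j, x(j)\big) + g\big(j, x(j), u(j), w(j)\big) + V\big(j+1, x(j+1)\big)\Big\}.
\end{aligned}$$
Since we are maximizing with respect to $x(j)$ such that $x(j+1) = f\big(j, x(j), u_j(j), w(j)\big)$ and then maximizing with respect to $x(j+1)$, this implies
$$\mathcal{V}\big(j+1, p(j+1)\big) = \sup_{x(j) \in \mathbb{R}^n}\ \sup_{w(j) \in \Omega_w}\Big\{P\big(j, x(j)\big) + g\big(j, x(j), u(j), w(j)\big) + V\big(j+1, x(j+1)\big)\Big\}.$$
Then from the dynamic programming recursion (3.38),
$$\mathcal{V}\big(j, p(j)\big) = \inf_{u(j) \in \mathbb{R}^{m_2}}\ \sup_{y(j) \in \mathbb{R}^{p_2}}\ \sup_{x(j) \in \mathbb{R}^n}\ \sup_{w(j) \in \Omega_w}\Big\{P\big(j, x(j)\big) + g\big(j, x(j), u(j), w(j)\big) + V\big(j+1, x(j+1)\big)\Big\}.$$
Then, since we are maximizing with respect to $w(j)$ such that $y(j) = h\big(j, x(j), w(j)\big)$ and then maximizing with respect to $y(j)$, this implies
$$\mathcal{V}\big(j, p(j)\big) = \inf_{u(j) \in \mathbb{R}^{m_2}}\ \sup_{x(j) \in \mathbb{R}^n}\ \sup_{w(j) \in \mathbb{R}^{m_1}}\Big\{P\big(j, x(j)\big) + g\big(j, x(j), u(j), w(j)\big) + V\big(j+1, x(j+1)\big)\Big\}.$$
If condition (3.40) holds at time $j$, then
$$\begin{aligned}
\mathcal{V}\big(j, p(j)\big) &= \sup_{x(j) \in \mathbb{R}^n}\Big\{P\big(j, x(j)\big) + V\big(j, x(j)\big)\Big\} && (3.42)\\
&= \sup_{x(j) \in \mathbb{R}^n}\Big\{P\big(j, x(j)\big) + \inf_{u(j) \in \mathbb{R}^{m_2}}\ \sup_{w(j) \in \mathbb{R}^{m_1}}\Big[g\big(j, x(j), u(j), w(j)\big) + V\big(j+1, x(j+1)\big)\Big]\Big\} && (3.43)\\
&= \sup_{x(j) \in \mathbb{R}^n}\ \inf_{u(j) \in \mathbb{R}^{m_2}}\ \sup_{w(j) \in \mathbb{R}^{m_1}}\Big\{P\big(j, x(j)\big) + g\big(j, x(j), u(j), w(j)\big) + V\big(j+1, x(j+1)\big)\Big\}.
\end{aligned}$$
Comparing this equation with the previous one proves the last part of the theorem. Further, we know that the maximizing $x(j)$ in equation (3.42) is $\check x(j)$, and the minimizing $u(j)$ in equation (3.43) is given by $u(j) = \mu_0\big(j, x(j)\big) = \mu_0\big(j, \check x(j)\big)$.

3.6.2 The linear case

We now specialize the previous theory to the case where the system is linear, and the cost function is quadratic. Specifically, we consider the system as described on $[k_0, k_f]$ by
$$\begin{aligned}
x(k+1) &= A(k)x(k) + B_1(k)w(k) + B_2(k)u(k), \qquad x(k_0) = x_0\\
z(k) &= C_1(k)x(k) + D_{12}(k)u(k)\\
y(k) &= C_2(k)x(k) + D_{21}(k)w(k).
\end{aligned}$$
Here the state $x(k) \in \mathbb{R}^n$, the disturbance $w \in \ell_2^{m_1}[k_0, k_f]$, and the control input $u \in \ell_2^{m_2}[k_0, k_f]$.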
By similar arguments to those used in the continuous time case, we limit the system without loss of generality to the case when $D_{11}(k) = D_{22}(k) = 0$ for all $k \in [k_0, k_f]$. We also make the standard assumptions
$$D_{21}(k)D_{21}(k)' = I, \qquad D_{12}(k)'D_{12}(k) = I, \qquad D_{21}(k)B_1(k)' = 0, \qquad D_{12}(k)'C_1(k) = 0$$
for all $k \in [k_0, k_f]$. Again, let the cost function be defined by
$$J(x_0, \mu, w) := -\gamma^2x_0'Q_0x_0 + \sum_{i=k_0}^{k_f}\Big\{|z(i)|^2 - \gamma^2|w(i)|^2\Big\} + x(k_f+1)'Q_fx(k_f+1)$$
where $Q_f > 0$, $Q_0 > 0$. It is straightforward to show that there exists a saddle point for the state feedback dynamic game if and only if the solution to the Riccati difference equation
$$\tilde X_\infty(k) = A(k)'\Big(\tilde X_\infty(k+1)^{-1} + B_2(k)B_2(k)' - \gamma^{-2}B_1(k)B_1(k)'\Big)^{-1}A(k) + C_1(k)'C_1(k) \tag{3.44}$$
with boundary condition
$$\tilde X_\infty(k_f+1) = Q_f$$
satisfies
$$\gamma^2I - B_1(k)'\tilde X_\infty(k+1)B_1(k) > 0$$
for all $k \in [k_0, k_f]$ (see for example [7]). Then the value function is given by
$$V\big(k, x(k)\big) = x(k)'\tilde X_\infty(k)x(k)$$
and the optimal state feedback control law is given by
$$u(k) = \mu_0\big(k, x(k)\big) = -B_2(k)'\Big(\tilde X_\infty(k+1)^{-1} + B_2(k)B_2(k)' - \gamma^{-2}B_1(k)B_1(k)'\Big)^{-1}A(k)x(k).$$
Also, the information state is given by
$$P\big(k, x(k)\big) = -\gamma^2\big(x(k) - \bar x(k)\big)'\tilde Y_\infty(k)^{-1}\big(x(k) - \bar x(k)\big) + \phi(k)$$
where $\phi(k)$ contains terms independent of $x(k)$ (depending on $\bar x$, $y_{j-1}$ and $u_{j-1}$), and $\tilde Y_\infty(k)$ satisfies the Riccati difference equation
$$\tilde Y_\infty(k+1) = A(k)\Delta(k)^{-1}A(k)' + B_1(k)B_1(k)', \qquad \tilde Y_\infty(k_0) = Q_0^{-1}$$
with
$$\Delta(k) = \tilde Y_\infty(k)^{-1} + C_2(k)'C_2(k) - \gamma^{-2}C_1(k)'C_1(k)$$
and
$$\bar x(k+1) = A(k)\bar x(k) + B_2(k)u(k) + A(k)\Delta(k)^{-1}\Big(\gamma^{-2}C_1(k)'C_1(k)\bar x(k) + C_2(k)'\big(y(k) - C_2(k)\bar x(k)\big)\Big).$$

A sufficient condition for a saddle point.

In the following Lemma, we give a simple sufficient condition for the existence of a saddle point for a static quadratic game defined on finite dimensional spaces. We will make use of this to show that the linear discrete dynamic game with a quadratic cost function satisfies the saddle point condition required in order to use the certainty equivalence principle.
We will also use it in Chapter 4 when deriving the corresponding result for the multi-rate sampled-data problem.

Lemma 3.32. Suppose $J(u, w)$ is the cost function defined by
$$J(u, w) = \begin{pmatrix} u\\ w\end{pmatrix}'\begin{bmatrix} R_{11} & R_{12}\\ R_{12}' & -R_{22}\end{bmatrix}\begin{pmatrix} u\\ w\end{pmatrix} + 2u'r_1 + 2w'r_2$$
for the static game
$$\min_{u \in \mathbb{R}^{m_2}}\ \max_{w \in \mathbb{R}^{m_1}} J(u, w).$$
Then, if $R_{11} > 0$ and $R_{22} > 0$, the game has a unique saddle point.

Proof. Since $R_{11} > 0$ and $R_{22} > 0$, for each fixed $u$ there exists a unique maximizing $w$ denoted by $w^*(u)$, and for each fixed $w$ there exists a unique minimising $u$ denoted by $u^*(w)$. By differentiation, these satisfy
$$R_{11}u^*(w) + R_{12}w + r_1 = 0, \qquad R_{12}'u - R_{22}w^*(u) + r_2 = 0.$$
Suppose there exist $u_0$, $w_0$ such that $u_0 = u^*(w_0)$ and $w_0 = w^*(u_0)$; then these will satisfy
$$J(u_0, w) \le J(u_0, w_0) \le J(u, w_0)$$
for all $u \in \mathbb{R}^{m_2}$, $w \in \mathbb{R}^{m_1}$, and hence will be a saddle point solution. Conversely, if $J(u_0, w_0) \le J(u, w_0)$ for all $u \in \mathbb{R}^{m_2}$, then $u_0 = u^*(w_0)$, and if $J(u_0, w) \le J(u_0, w_0)$ for all $w \in \mathbb{R}^{m_1}$, then $w_0 = w^*(u_0)$. Hence $u_0$, $w_0$ are a saddle point for this game if and only if they are a solution to
$$\begin{bmatrix} R_{11} & R_{12}\\ R_{12}' & -R_{22}\end{bmatrix}\begin{pmatrix} u_0\\ w_0\end{pmatrix} = -\begin{pmatrix} r_1\\ r_2\end{pmatrix}.$$
Since $R_{11}$ is nonsingular, the matrix on the left hand side of this equation is nonsingular if and only if $-R_{22} - R_{12}'R_{11}^{-1}R_{12}$ is nonsingular, which by the positive definiteness assumptions is true. Therefore there exists a unique saddle point solution.

We now wish to check that, for the discrete time linear quadratic problem, the conditions of Theorem 3.31 hold.
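Lemma 3.32 is the main tool in that check, and it can be illustrated numerically before it is applied. The following is a minimal sketch, assuming the sign convention in which the $w$ block of the quadratic form is $-R_{22}$ with $R_{22} > 0$; the function names `quadratic_cost` and `saddle_point` are hypothetical and are introduced only for this illustration. The saddle point is obtained by solving the stationarity equations from the proof as one linear system.

```python
import numpy as np

def quadratic_cost(u, w, R11, R12, R22, r1, r2):
    """J(u, w) = u'R11 u + 2 u'R12 w - w'R22 w + 2 u'r1 + 2 w'r2."""
    return (u @ R11 @ u + 2.0 * u @ R12 @ w - w @ R22 @ w
            + 2.0 * u @ r1 + 2.0 * w @ r2)

def saddle_point(R11, R12, R22, r1, r2):
    """Solve [R11 R12; R12' -R22][u0; w0] = -[r1; r2] for the saddle point.

    Assumes R11 > 0 and R22 > 0, so the block matrix is nonsingular.
    """
    m2 = R11.shape[0]
    K = np.block([[R11, R12], [R12.T, -R22]])
    sol = np.linalg.solve(K, -np.concatenate([r1, r2]))
    return sol[:m2], sol[m2:]
```

With the returned pair `(u0, w0)`, evaluating `quadratic_cost` at perturbed values of `u` (with `w0` fixed) should never give a smaller cost, and perturbing `w` (with `u0` fixed) should never give a larger one, which is the saddle point inequality used in the proof.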
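We have
$$\begin{aligned}
g\big(k, x(k), u(k), w(k)\big) + V\big(k+1, x(k+1)\big) &= |z(k)|^2 - \gamma^2|w(k)|^2 + x(k+1)'\tilde X_\infty(k+1)x(k+1)\\
&= \begin{pmatrix} x(k+1)\\ z(k)\\ w(k)\end{pmatrix}'\begin{bmatrix}\tilde X_\infty & 0 & 0\\ 0 & I & 0\\ 0 & 0 & -\gamma^2I\end{bmatrix}\begin{pmatrix} x(k+1)\\ z(k)\\ w(k)\end{pmatrix}\\
&= \begin{pmatrix} w(k)\\ u(k)\\ x(k)\end{pmatrix}'T'\begin{bmatrix}\tilde X_\infty & 0 & 0\\ 0 & I & 0\\ 0 & 0 & -\gamma^2I\end{bmatrix}T\begin{pmatrix} w(k)\\ u(k)\\ x(k)\end{pmatrix}\\
&= \begin{pmatrix} w(k)\\ u(k)\\ x(k)\end{pmatrix}'\begin{bmatrix} B_1'\tilde X_\infty B_1 - \gamma^2I & B_1'\tilde X_\infty B_2 & B_1'\tilde X_\infty A\\ B_2'\tilde X_\infty B_1 & B_2'\tilde X_\infty B_2 + I & B_2'\tilde X_\infty A\\ A'\tilde X_\infty B_1 & A'\tilde X_\infty B_2 & A'\tilde X_\infty A + C_1'C_1\end{bmatrix}\begin{pmatrix} w(k)\\ u(k)\\ x(k)\end{pmatrix}
\end{aligned}$$
where the system matrices $A$, $B_1$, $B_2$, $C_1$, $D_{12}$ are evaluated at time $k$, and $\tilde X_\infty$ is evaluated at $k + 1$, and we have written
$$T = \begin{bmatrix} B_1 & B_2 & A\\ 0 & D_{12} & C_1\\ I & 0 & 0\end{bmatrix}.$$
We wish to check that the problem
$$\inf_{u(k) \in \mathbb{R}^{m_2}}\ \sup_{x(k) \in \mathbb{R}^n}\ \sup_{w(k) \in \mathbb{R}^{m_1}}\Big\{P\big(k, x(k)\big) + g\big(k, x(k), u(k), w(k)\big) + V\big(k+1, x(k+1)\big)\Big\} \tag{3.45}$$
has a saddle point in $u(k)$, $x(k)$. Let
$$J(k) = \big(B_1(k)'\tilde X_\infty(k+1)B_1(k) - \gamma^2I\big)^{-1};$$
then it is clear that the maximum with respect to $w$ has a unique solution if and only if $J(k) < 0$ for all $k \in [k_0, k_f]$. Note that this is also a necessary condition for the existence of a saddle point in the state feedback problem; see for example [7, Theorem 3.2].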
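The recursion (3.44) and the condition $J(k) < 0$ can be evaluated together by stepping backwards from $k_f + 1$. The following is a minimal sketch, assuming time-invariant matrices and that $\tilde X_\infty$ remains invertible along the recursion (both assumptions of this illustration); the function name `riccati_backward` is hypothetical.

```python
import numpy as np

def riccati_backward(A, B1, B2, C1, Qf, gamma, steps):
    """Iterate (3.44) backwards from X(kf+1) = Qf for `steps` stages.

    Returns the list [X(kf+1-steps), ..., X(kf+1)], or None as soon as
    gamma^2 I - B1' X(k+1) B1 > 0 fails, i.e. when J(k) < 0 is violated.
    Assumes constant matrices and an invertible X at every stage.
    """
    g2 = gamma ** 2
    X = Qf.copy()
    history = [X]
    for _ in range(steps):
        gap = g2 * np.eye(B1.shape[1]) - B1.T @ X @ B1
        gap = 0.5 * (gap + gap.T)                 # symmetrize before eigvalsh
        if np.min(np.linalg.eigvalsh(gap)) <= 0.0:
            return None                           # maximization over w is unbounded
        inner = np.linalg.inv(X) + B2 @ B2.T - B1 @ B1.T / g2
        X = A.T @ np.linalg.solve(inner, A) + C1.T @ C1
        history.append(X)
    return history[::-1]
```

Returning `None` rather than raising is a design choice of the sketch: it lets a caller bisect on `gamma` to estimate the smallest achievable level. The test inside the loop is exactly the condition $J(k) < 0$ stated above.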
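If this condition holds, then let
$$L = \begin{bmatrix} I & JB_1'\tilde X_\infty B_2 & JB_1'\tilde X_\infty A\\ 0 & I & 0\\ 0 & 0 & I\end{bmatrix}$$
$$\begin{bmatrix} R_{11} & R_{12}\\ R_{12}' & R_{22}\end{bmatrix} = \begin{bmatrix} B_2'\tilde X_\infty B_2 + I & B_2'\tilde X_\infty A\\ A'\tilde X_\infty B_2 & A'\tilde X_\infty A + C_1'C_1\end{bmatrix} - \begin{bmatrix} B_2'\tilde X_\infty B_1\\ A'\tilde X_\infty B_1\end{bmatrix}J\begin{bmatrix} B_1'\tilde X_\infty B_2 & B_1'\tilde X_\infty A\end{bmatrix};$$
then, since $P\big(k, x(k)\big) = -\gamma^2\big(x(k) - \bar x(k)\big)'\tilde Y_\infty(k)^{-1}\big(x(k) - \bar x(k)\big) + \phi(k)$, with $\phi(k)$ independent of $x(k)$, using the Schur complement formula, we arrive at
$$\begin{aligned}
&\sup_{w(k) \in \mathbb{R}^{m_1}}\Big\{P\big(k, x(k)\big) + g\big(k, x(k), u(k), w(k)\big) + V\big(k+1, x(k+1)\big)\Big\}\\
&\quad = \sup_{w(k) \in \mathbb{R}^{m_1}}\begin{pmatrix} w(k)\\ u(k)\\ x(k)\end{pmatrix}'L'\begin{bmatrix} J^{-1} & 0 & 0\\ 0 & R_{11} & R_{12}\\ 0 & R_{12}' & R_{22}\end{bmatrix}L\begin{pmatrix} w(k)\\ u(k)\\ x(k)\end{pmatrix} - \gamma^2\big(x(k) - \bar x(k)\big)'\tilde Y_\infty(k)^{-1}\big(x(k) - \bar x(k)\big) + \phi(k)\\
&\quad = \begin{pmatrix} u(k)\\ x(k)\end{pmatrix}'\begin{bmatrix} R_{11} & R_{12}\\ R_{12}' & R_{22}\end{bmatrix}\begin{pmatrix} u(k)\\ x(k)\end{pmatrix} - \gamma^2\big(x(k) - \bar x(k)\big)'\tilde Y_\infty(k)^{-1}\big(x(k) - \bar x(k)\big) + \phi(k).
\end{aligned}$$
Then by the previous Lemma, the minimax problem in equation (3.41) has a saddle point if
$$R_{11} > 0, \qquad R_{22} - \gamma^2\tilde Y_\infty(k)^{-1} < 0$$
and algebra gives
$$R_{11} = B_2'\tilde X_\infty(k+1)B_2 + I - B_2'\tilde X_\infty(k+1)B_1J(k)B_1'\tilde X_\infty(k+1)B_2 \tag{3.46}$$
$$R_{22} = A'\tilde X_\infty(k+1)A + C_1'C_1 - A'\tilde X_\infty(k+1)B_1J(k)B_1'\tilde X_\infty(k+1)A. \tag{3.47}$$
Since $J(k) < 0$, the first condition that $R_{11} > 0$ is automatically satisfied. The second condition that $R_{22} - \gamma^2\tilde Y_\infty(k)^{-1} < 0$ is a generalization of the usual coupling condition that occurs in the case when $u(k)$ is allowed to depend on measurements of $y(k)$ also. In the latter case, the condition is that $\tilde X_\infty(k) - \gamma^2\tilde Y_\infty(k)^{-1} \le 0$ for all $k \in [k_0, k_f]$. In this case, we require that $u$ is a strictly causal function of $y$. Condition (3.47) is in fact stronger than the standard condition, and implies that condition. If $B_2(k)$ were equal to zero, then condition (3.47) would be equivalent to the standard condition at time $k$. This is exactly what we expect for the one step delayed measurement problem, since during that one step $u$ has no control, and so the problem during that time step is simply a quadratic maximization with respect to $w$.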
This kind of condition is repeated throughout the time domain derivations of various measurement problems, notably the sampled-data problem, when a necessary condition is that the maximum with respect to $w$ of the cost function in between samples is bounded. If these conditions are satisfied, we can calculate $\check x$ using
$$\check x(k) \in \arg\max_{x(k) \in \mathbb{R}^n}\Big\{P\big(k, x(k)\big) + V\big(k, x(k)\big)\Big\}$$
and we know
$$\begin{aligned}
P\big(k, x(k)\big) + V\big(k, x(k)\big) &= -\gamma^2\big(x(k) - \bar x(k)\big)'\tilde Y_\infty(k)^{-1}\big(x(k) - \bar x(k)\big) + \phi(k) + x(k)'\tilde X_\infty(k)x(k)\\
&= \begin{pmatrix} x\\ \bar x\end{pmatrix}'\begin{bmatrix}\tilde X_\infty - \gamma^2\tilde Y_\infty^{-1} & \gamma^2\tilde Y_\infty^{-1}\\ \gamma^2\tilde Y_\infty^{-1} & -\gamma^2\tilde Y_\infty^{-1}\end{bmatrix}\begin{pmatrix} x\\ \bar x\end{pmatrix} + \phi\\
&= \begin{pmatrix} x\\ \bar x\end{pmatrix}'T'\begin{bmatrix}\tilde X_\infty - \gamma^2\tilde Y_\infty^{-1} & 0\\ 0 & -\gamma^2\tilde Y_\infty^{-1} - \gamma^2\tilde Y_\infty^{-1}\big(\gamma^{-2}\tilde X_\infty - \tilde Y_\infty^{-1}\big)^{-1}\tilde Y_\infty^{-1}\end{bmatrix}T\begin{pmatrix} x\\ \bar x\end{pmatrix} + \phi
\end{aligned}$$
where
$$T = \begin{bmatrix} I & \big(\gamma^{-2}\tilde Y_\infty\tilde X_\infty - I\big)^{-1}\\ 0 & I\end{bmatrix}.$$
By assumption, condition (3.47) is satisfied, and hence $\tilde X_\infty - \gamma^2\tilde Y_\infty^{-1} < 0$. Therefore this maximization problem has a unique solution, given by
$$\check x(k) = \big(I - \gamma^{-2}\tilde Y_\infty(k)\tilde X_\infty(k)\big)^{-1}\bar x(k)$$
and the optimal controller is therefore
$$u(k) = \mu_0\big(k, \check x(k)\big) = -B_2(k)'\Big(\tilde X_\infty(k+1)^{-1} + B_2(k)B_2(k)' - \gamma^{-2}B_1(k)B_1(k)'\Big)^{-1}A(k)\big(I - \gamma^{-2}\tilde Y_\infty(k)\tilde X_\infty(k)\big)^{-1}\bar x(k).$$

3.7 Summary

In this chapter we began with the basic results of static games, and then showed that finite horizon induced 2-norm problems can be cast into the dynamic game formulation, which can be regarded as a sequence of coupled static games in the discrete case, and a differential equation dependent on a dynamic game in the continuous time case. We have given an overview of the certainty equivalence principle derived by Whittle [78] and Başar and Bernhard [7], and given a new explicit derivation of this for the linear-quadratic continuous time finite horizon output feedback $H_\infty$ problem. In the discrete problem, we have used the information state ideas of James, Baras and Elliott [40] to construct a solution to the one-step-delayed output feedback problem. This derivation is not new, but the ideas involved will be used heavily in Chapter 4 to construct solutions to the multi-rate sampled-data problem.

4.
MULTI-RATE SAMPLED-DATA $H_\infty$ CONTROL

4.1 Motivation

In this chapter we will consider the disturbance rejection problem for multi-rate sampled-data systems. By sampled-data systems we mean systems where a continuous signal from a plant is sampled to give measurements at discrete intervals. Further, control is exerted on the plant through a discrete output from the controller, which is then converted to a continuous signal using a hold device.

The requirement for sampled-data systems arises naturally in engineering problems in two main ways. The first is through the desire to implement modern controllers on digital hardware. This means that the controller has to be implemented as a discrete algorithm. Often, the assumption is made that by sampling at a fast enough rate it is possible to simply approximate a desired continuous controller with such a discrete controller. This natural assumption has been shown to be justified in practice, and is also given some theoretical justification by [41]. However, we aim in this chapter to give a systematic design procedure for such systems. That is, we wish to know exactly how fast it is necessary to sample in order to achieve some specified level of performance.

The second major justification for the sampled-data approach to controller design is that some processes only provide discrete information to the controller. A simple example is that of a chemical process, where in order to find out how far a particular reaction has progressed, it is necessary to remove a sample of the reactants and perform some kind of analysis on it. This would normally provide information only at discrete intervals. Similarly, in biomedical applications such as anaesthetic control, measurements such as heart rate can only be obtained discretely. Again, for such a system we are looking for a systematic method to design robust controllers.
Several authors have given synthesis techniques for the construction of $H_\infty$ suboptimal controllers for sampled-data systems. In particular, solutions to the infinite horizon time invariant problem are given by Hara and Kabamba [31] and Bamieh and Pearson [3], by constructing equivalent discrete time systems. Further simplifications to this solution are given by Chen and Francis in [19]. These results are derived using lifting techniques, where the original hybrid system is transformed to one on a discrete operator-valued space. The finite horizon time varying problem has also been considered, notably by Başar and Bernhard [7]. Further solutions to this problem are considered in [6, 67, 64, 11].

The above results all consider systems with single rate periodic sampling and hold devices synchronized with each other. In this chapter we will be considering sampled-data systems which are both multi-rate and asynchronous. In this case the controller is connected to the plant through multiple separate sampling and holding devices, which are assumed to be asynchronous. Further, we will not assume any periodicity of the sampling and hold operators.

The motivation for this kind of synthesis result comes both from physical systems and from technical reasons. One example of a physical system where the design of such a controller is a significant problem is the control of automobile engines. In this case the system has a natural almost-periodic behaviour, and different kinds of measurements and control are available at different timings in the engine cycle. Also, the rate of the engine is not fixed, but varies according to a command signal from the driver.

We give necessary and sufficient conditions for the existence of a controller satisfying a given performance bound with given asynchronous sampling and hold devices, and give a realisation for such a controller if one exists. This allows a comparison of the achievable performance for systems with different sampling rates.
[Figure 4.1: The Multi-rate Sampled-data Problem (block labels G, K, H1, H2, H3, S1, S2, S3; signals y, z, u, v, w)]

The general sampled-data setup we will use is shown in Figure 4.1. Again we will use as performance measure an induced 2-norm. It will be natural for us to consider some disturbances as discrete and others as continuous.

Multi-rate systems have been considered by several authors, notably Chen and Qiu [20] and Voulgaris [76]. In these papers, the sampling and hold devices are assumed to have rational rates, and to be synchronized with each other, so that the system can be lifted up to either the fastest or the slowest commensurate period. We will not use lifting techniques in this chapter, and so we will avoid the need for overall periodicity of the system, and hence for shift invariance when it is lifted to some appropriate level. We will use the results of Chapter 3 here to derive solutions in the time domain, by separating the problem into the state feedback and estimation cases, and then showing that the separation theory holds for this problem.

Firstly, we will describe a very general class of systems, that of hybrid systems with discrete jumps in the state vector. We will derive necessary and sufficient conditions for the existence of a state feedback controller for such systems. We then specialise these results to the multi-rate sampled-data problem, and derive an expression for the information state for the measurement feedback problem. We will then be able to recombine these results to derive a solution to the output feedback problem.

4.2 State feedback

4.2.1 Systems with jumps

We will first consider controller synthesis for a wide class of systems, those with discrete jumps in the state vector. Let $t_0 < t_1 < \cdots < t_f$ be a fixed known sequence of times. We will consider systems of the form

\[
\begin{aligned}
\dot x(t) &= \hat A(t)x(t) + \hat B_1(t)w(t), \qquad t \notin \{t_0, t_1, \ldots, t_f\}, \qquad x(t_0^-) = x_0,\\
x(t_i) &= \tilde A[t_i]\,x(t_i^-) + \hat B_2[t_i]\,\tilde u[t_i],\\
z(t) &= \hat C_1(t)x(t)
\end{aligned}
\tag{4.1}
\]

for $t_0 \le t \le t_f$ and $t_i \in \{t_0, t_1, \ldots, t_f\}$.
The system matrices $\hat A$, $\hat B_1$ and $\hat C_1$ are bounded functions on the continuous interval $[t_0, t_f]$, and $\tilde A$, $\hat B_2$ are bounded discrete functions defined on the set $\{t_0, t_1, \ldots, t_f\}$. Here $w$ is a continuous disturbance input, $z$ is the continuous controlled output, and $\tilde u$ is a discrete control input. The state $x$ is a right continuous function of time, but it may be left discontinuous with finite jumps at the times $t_i$. We will denote by $x(t_i^-)$ the value of $x(t)$ just before time $t_i$; that is,

\[
x(t_i^-) := \lim_{\varepsilon > 0,\ \varepsilon \to 0} x(t_i - \varepsilon).
\]

We will show that, if we wish to synthesize a controller for a sampled-data system, it is possible to express the continuous system combined with generalized sample and hold operators in this form.

This class of systems was developed by Sun, Nagpal and Khargonekar [64], who used quite different techniques to construct solutions to $H_\infty$ synthesis problems for particular special cases of this general form of system. In this chapter we solve explicitly the general problem of finding a state feedback controller $\tilde u[t_i] = \mu\bigl(t_i, x(t_i^-)\bigr)$ such that the $L_2$-induced norm from $w$ to $z$ is less than a prespecified level $\gamma$. We will then specialize this result to both single and multi-rate sampled-data systems, and go on to construct the controller in the measurement feedback case, using the information state theory of Chapter 3.

4.2.2 Single hold

First, to illustrate the approach, we will consider synthesis with a single rate discrete input $\tilde u$. Define the space of causal and memoryless state feedback strategies by

\[
U_{sf}^{n,p}[t_i, t_j] := \bigl\{ \mu : \{t_i, \ldots, t_j\} \times \mathbb{R}^n \to \mathbb{R}^p \bigr\}
\]

with the control signal defined by $\tilde u[t_i] = \mu\bigl(t_i, x(t_i^-)\bigr)$ for $\mu \in U_{sf}$. Then we would like to find the minimizing controller for the following dynamic game problem

\[
\inf_{\mu \in U_{sf}}\ \sup_{w \in L_2} \left\{ \int_{t_0}^{t_f} \bigl( z(t)'z(t) - \gamma^2 w(t)'w(t) \bigr)\, dt + x(t_f^-)'Q_f\,x(t_f^-) \right\}
\]

where $Q_f$ is a positive semidefinite weighting matrix.
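To make the structure of (4.1) concrete, the following is a minimal simulation sketch for a scalar system with jumps. It is an illustration only, under several assumptions not made in the text: all coefficients are constant scalars, the flow between jump times is integrated with a simple forward Euler rule, and the names `simulate_jump_system`, `policy` and `dt` are hypothetical. The callable `policy` plays the role of a memoryless state feedback strategy acting on the pre-jump state $x(t_i^-)$.

```python
def simulate_jump_system(a_hat, b1_hat, a_tilde, b2_hat, jump_times,
                         x0, policy, w, dt=1e-3):
    """Simulate a scalar instance of the jump system (4.1).

    Between jump times:  dx/dt = a_hat * x + b1_hat * w(t)   (forward Euler)
    At each jump time:   x(t_i) = a_tilde * x(t_i^-) + b2_hat * u[t_i]
    with u[t_i] = policy(t_i, x(t_i^-)).

    Returns a list of tuples (t_i, x(t_i^-), u[t_i], x(t_i)).
    """
    x_minus = x0  # x(t_0^-) = x_0
    history = []
    for i, t_i in enumerate(jump_times):
        u = policy(t_i, x_minus)              # uses only the pre-jump state
        x = a_tilde * x_minus + b2_hat * u    # discrete jump in the state
        history.append((t_i, x_minus, u, x))
        if i + 1 < len(jump_times):
            t_next = jump_times[i + 1]
            n = max(1, round((t_next - t_i) / dt))
            h = (t_next - t_i) / n
            t = t_i
            for _ in range(n):                # continuous flow on (t_i, t_{i+1})
                x = x + h * (a_hat * x + b1_hat * w(t))
                t += h
            x_minus = x                       # this is x(t_{i+1}^-)
    return history


# Example: no drift, no disturbance, proportional feedback at each jump.
history = simulate_jump_system(
    a_hat=0.0, b1_hat=1.0, a_tilde=1.0, b2_hat=1.0,
    jump_times=[0.0, 1.0, 2.0], x0=4.0,
    policy=lambda t, x: -0.5 * x, w=lambda t: 0.0)
```

Forward Euler is used only to keep the sketch short; any standard ODE integrator could replace the inner loop.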
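As in the purely discrete problem of Chapter 3, we can apply dynamic programming to give

\[
\begin{aligned}
&\inf_{\mu \in U_{sf}[t_0,t_f]}\ \sup_{w \in L_2} \left\{ \int_{t_0}^{t_f} \bigl( z(t)'z(t) - \gamma^2 w(t)'w(t) \bigr)\, dt + x(t_f^-)'Q_f\,x(t_f^-) \right\} \\
&\quad = \inf_{\tilde u[t_0] \in \mathbb{R}^{m_2}}\ \sup_{w \in L_2[t_0,t_1]} \Biggl\{ \int_{t_0}^{t_1} \bigl( z(t)'z(t) - \gamma^2 w(t)'w(t) \bigr)\, dt \\
&\qquad + \inf_{\tilde u[t_1] \in \mathbb{R}^{m_2}}\ \sup_{w \in L_2[t_1,t_2]} \Biggl\{ \int_{t_1}^{t_2} \bigl( z(t)'z(t) - \gamma^2 w(t)'w(t) \bigr)\, dt + \cdots \\
&\qquad + \inf_{\tilde u[t_{f-1}] \in \mathbb{R}^{m_2}}\ \sup_{w \in L_2[t_{f-1},t_f]} \Biggl\{ \int_{t_{f-1}}^{t_f} \bigl( z(t)'z(t) - \gamma^2 w(t)'w(t) \bigr)\, dt + x(t_f^-)'Q_f\,x(t_f^-) \Biggr\} \cdots \Biggr\} \Biggr\}.
\end{aligned}
\]

Then as before we can define the Value of the game

\[
V\bigl(t_i, x(t_i^-)\bigr) = \inf_{\mu \in U_{sf}[t_i,t_f]}\ \sup_{w \in L_2[t_i,t_f]} \left\{ \int_{t_i}^{t_f} \bigl( z(t)'z(t) - \gamma^2 w(t)'w(t) \bigr)\, dt + x(t_f^-)'Q_f\,x(t_f^-) \right\}.
\]

It is then straightforward to show by induction that the Value function satisfies the dynamic programming recursion

\[
V\bigl(t_i, x(t_i^-)\bigr) = \inf_{\tilde u[t_i] \in \mathbb{R}^{m_2}}\ \sup_{w \in L_2[t_i,t_{i+1}]} \left\{ \int_{t_i}^{t_{i+1}} \bigl( z(t)'z(t) - \gamma^2 w(t)'w(t) \bigr)\, dt + V\bigl(t_{i+1}, x(t_{i+1}^-)\bigr) \right\}
\tag{4.2}
\]

with boundary condition

\[
V\bigl(t_f, x(t_f^-)\bigr) = x(t_f^-)'Q_f\,x(t_f^-).
\]

Note now that on each subinterval $[t_i, t_{i+1}]$ the $\tilde u$ player has only open loop information; that is, $\tilde u$ only has knowledge of $x(t_i^-)$. Further, according to Remark 3.7 in Chapter 3, we need only consider disturbances $w$ in $L_2$. Therefore, each subproblem is tantamount to a static linear quadratic game. This recursion is thus very similar to that derived in the discrete $H_\infty$ problem, only we have $\tilde u[t_i] \in \mathbb{R}^{m_2}$ and $w \in L_2[t_i, t_{i+1}]$.

We will now show by induction that the Value function has the form $V\bigl(t_i, x(t_i^-)\bigr) = x(t_i^-)'X[t_i]\,x(t_i^-)$ for some matrix function $X$. Clearly, this holds at time $t_f$, since $V\bigl(t_f, x(t_f^-)\bigr) = x(t_f^-)'Q_f\,x(t_f^-)$. Now assume that it holds at time $t_{i+1}$.

Theorem 4.1. Suppose that $V\bigl(t_{i+1}, x(t_{i+1}^-)\bigr) = x(t_{i+1}^-)'X[t_{i+1}]\,x(t_{i+1}^-)$.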
Then, if there exists a bounded solution to the Riccati differential equation

\[
-\dot Z_i = \hat A'Z_i + Z_i\hat A + \gamma^{-2} Z_i \hat B_1 \hat B_1' Z_i + \hat C_1'\hat C_1, \qquad Z_i(t_{i+1}) = X[t_{i+1}]
\tag{4.3}
\]

on the interval $[t_i, t_{i+1}]$, then the value at time $t_i$ is given by

\[
V\bigl(t_i, x(t_i^-)\bigr) = x(t_i^-)'X[t_i]\,x(t_i^-),
\]

where

\[
X[t_i] = \tilde A[t_i]'Z_i(t_i)\tilde A[t_i] - \tilde A[t_i]'Z_i(t_i)\hat B_2[t_i] \bigl( \hat B_2[t_i]'Z_i(t_i)\hat B_2[t_i] \bigr)^{\dagger} \hat B_2[t_i]'Z_i(t_i)\tilde A[t_i].
\]

Further, one possible minimizing strategy for $\tilde u[t_i]$ is given by

\[
\tilde u[t_i] = -\bigl( \hat B_2[t_i]'Z_i(t_i)\hat B_2[t_i] \bigr)^{\dagger} \hat B_2[t_i]'Z_i(t_i)\tilde A[t_i]\,x(t_i^-)
\tag{4.4}
\]

and the corresponding worst case disturbance $w \in L_2[t_i, t_{i+1}]$ is given by

\[
w(t) = \gamma^{-2}\hat B_1(t)'Z_i(t)x(t).
\]

Proof. Define the cost function on the subinterval $[t_i, t_{i+1}]$

\[
J_s\bigl(x(t_i^-), \tilde u[t_i], w\bigr) = \int_{t_i}^{t_{i+1}} \bigl( z(t)'z(t) - \gamma^2 w(t)'w(t) \bigr)\, dt + x(t_{i+1}^-)'X[t_{i+1}]\,x(t_{i+1}^-).
\]

Then

\[
V\bigl(t_i, x(t_i^-)\bigr) = \inf_{\tilde u[t_i] \in \mathbb{R}^{m_2}}\ \sup_{w \in L_2[t_i,t_{i+1}]} J_s\bigl(x(t_i^-), \tilde u[t_i], w\bigr).
\tag{4.5}
\]

We first solve the maximization problem. Straightforward completion of the square gives

\[
J_s\bigl(x(t_i^-), \tilde u[t_i], w\bigr) = x(t_i)'Z_i(t_i)x(t_i) - \int_{t_i}^{t_{i+1}} \gamma^2 \bigl| w(t) - \gamma^{-2}\hat B_1(t)'Z_i(t)x(t) \bigr|^2\, dt
\tag{4.6}
\]

which gives the above expression for the worst case $w$. Note that, although this is expressed as a feedback law, for any fixed $\tilde u[t_i]$ it corresponds to a unique signal in $L_2$.
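We then have to solve the minimization problem

\[
\begin{aligned}
V\bigl(t_i, x(t_i^-)\bigr) &= \inf_{\tilde u[t_i] \in \mathbb{R}^{m_2}}\ \sup_{w \in L_2[t_i,t_{i+1}]} J_s\bigl(x(t_i^-), \tilde u[t_i], w\bigr) \\
&= \inf_{\tilde u[t_i] \in \mathbb{R}^{m_2}} x(t_i)'Z_i(t_i)x(t_i) \\
&= \inf_{\tilde u[t_i] \in \mathbb{R}^{m_2}} \bigl( \tilde A[t_i]x(t_i^-) + \hat B_2[t_i]\tilde u[t_i] \bigr)' Z_i(t_i) \bigl( \tilde A[t_i]x(t_i^-) + \hat B_2[t_i]\tilde u[t_i] \bigr) \\
&= \inf_{\tilde u[t_i] \in \mathbb{R}^{m_2}} \begin{pmatrix} x(t_i^-) \\ \tilde u[t_i] \end{pmatrix}' \begin{pmatrix} \tilde A[t_i]'Z_i(t_i)\tilde A[t_i] & \tilde A[t_i]'Z_i(t_i)\hat B_2[t_i] \\ \hat B_2[t_i]'Z_i(t_i)\tilde A[t_i] & \hat B_2[t_i]'Z_i(t_i)\hat B_2[t_i] \end{pmatrix} \begin{pmatrix} x(t_i^-) \\ \tilde u[t_i] \end{pmatrix}.
\end{aligned}
\]

Since $Q_f \ge 0$, Proposition 3.19 implies that $Z_i(t_i) \ge 0$, and hence applying the Schur complement formula gives

\[
\begin{pmatrix} \tilde A[t_i]'Z_i(t_i)\tilde A[t_i] & \tilde A[t_i]'Z_i(t_i)\hat B_2[t_i] \\ \hat B_2[t_i]'Z_i(t_i)\tilde A[t_i] & \hat B_2[t_i]'Z_i(t_i)\hat B_2[t_i] \end{pmatrix} = T \begin{pmatrix} S & 0 \\ 0 & \hat B_2[t_i]'Z_i(t_i)\hat B_2[t_i] \end{pmatrix} T'
\]

where

\[
T = \begin{pmatrix} I & \tilde A[t_i]'Z_i(t_i)\hat B_2[t_i] \bigl( \hat B_2[t_i]'Z_i(t_i)\hat B_2[t_i] \bigr)^{\dagger} \\ 0 & I \end{pmatrix},
\]

\[
S = \tilde A[t_i]'Z_i(t_i)\tilde A[t_i] - \tilde A[t_i]'Z_i(t_i)\hat B_2[t_i] \bigl( \hat B_2[t_i]'Z_i(t_i)\hat B_2[t_i] \bigr)^{\dagger} \hat B_2[t_i]'Z_i(t_i)\tilde A[t_i],
\]

and the result then follows.

This result shows that, if $V\bigl(t_{i+1}, x(t_{i+1}^-)\bigr)$ is quadratic in $x$, then so is $V\bigl(t_i, x(t_i^-)\bigr)$, and it gives us a recursive formula for $X[t_i]$. Hence by induction we can conclude that, if there exists a solution to the Riccati equation (4.3) on each interval $[t_i, t_{i+1}]$, then $V\bigl(t_i, x(t_i^-)\bigr)$ is quadratic for all $t_i$. If $x(t_0^-) = 0$, then $V\bigl(t_0, x(t_0^-)\bigr) = 0$, and hence

\[
0 = \inf_{\mu \in U_{sf}}\ \sup_{w \in L_2} \left\{ \int_{t_0}^{t_f} \bigl( z(t)'z(t) - \gamma^2 w(t)'w(t) \bigr)\, dt + x(t_f^-)'Q_f\,x(t_f^-) \right\}.
\]

Hence the state feedback controller defined by equation (4.4) achieves the induced $L_2$ norm bound $\|T_{zw}\| \le \gamma$. We now derive the necessary conditions for the existence of a bounded upper value for the dynamic game, and hence those for the existence of a $\gamma$-feasible state feedback controller.

4.2.3 Necessary conditions for state feedback control

We will make use of the following result, from Başar and Bernhard [7, Theorem 8.3].

Theorem 4.2. Let $A$, $B_1$ and $Q$ be bounded matrix functions on the interval $[t_0, t_f]$, and suppose $Q(t) > 0$ for all $t \in [t_0, t_f]$.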
Consider the Riccati equation

\[
-\dot S = A'S + SA + \gamma^{-2} S B_1 B_1' S + Q, \qquad S(t_f) = Q_f.
\tag{4.7}
\]

Define

\[
\hat\gamma = \inf\bigl\{ \tilde\gamma : \text{for all } \gamma > \tilde\gamma \text{ the RDE (4.7) has a bounded solution on } [t_0, t_f] \bigr\}.
\]

Then, for any fixed $x_0 \in \mathbb{R}^n$, the quadratic optimization

\[
\sup_{w \in L_2[t_0,t_f]} \int_{t_0}^{t_f} \bigl( x(t)'Q(t)x(t) - \gamma^2 w(t)'w(t) \bigr)\, dt + x(t_f)'Q_f\,x(t_f)
\]

subject to the dynamics

\[
\dot x = Ax + B_1 w, \qquad x(t_0) = x_0
\]

has a finite supremum if $\gamma > \hat\gamma$, and only if $\gamma \ge \hat\gamma$.

Theorem 4.3. Suppose there exists a control law $\tilde u[t_i] = \mu\bigl(t_i, x(t_i^-)\bigr)$ for $i = 1, \ldots, f$ which results in

\[
\sup_{w \in L_2[t_0,t_f]} \left\{ \int_{t_0}^{t_f} \bigl( z(t)'z(t) - \gamma^2 w(t)'w(t) \bigr)\, dt + x(t_f^-)'Q_f\,x(t_f^-) \right\} \le 0
\]

with initial state $x(t_0^-) = 0$. Define $\hat\gamma$ as the infimum of the set of $\gamma$ such that the Riccati differential equation (4.3) has a bounded solution on $[t_i, t_{i+1}]$ for all $i = 1, \ldots, f$. Then $\gamma \ge \hat\gamma$.

Proof. Suppose that $\gamma < \hat\gamma$. Then, for some $i$, $\gamma$ is strictly less than the infimum of the set of $\gamma$ such that the Riccati differential equation (4.3) has a bounded solution on $[t_i, t_{i+1}]$. Therefore, applying Theorem 4.2, we know that the maximization problem given by equation (4.5) is unbounded with respect to $w$ for all $x(t_i)$. Hence the upper value of the game is unbounded.

4.3 Single-rate sampled-data control

We now specialize the results of the previous section to the full state information sampled-data problem, with a single zero-order hold input. Suppose the continuous time system has the form

\[
\begin{aligned}
\dot x(t) &= A(t)x(t) + B_1(t)w(t) + B_2(t)u(t), \qquad x(t_0) = x_0,\\
z(t) &= C_1(t)x(t) + D_{12}(t)u(t)
\end{aligned}
\tag{4.8}
\]

on the interval $[t_0, t_f]$, with the system matrices bounded functions of time. We again make the following assumptions about the system matrices:

\[
D_{12}(t)'C_1(t) = 0, \qquad D_{12}(t)'D_{12}(t) = I
\]

for all $t \ge 0$. The hold operator $H : \ell_2^{m_2}[0, f] \to L_2^{m_2}[t_0, t_f]$ is defined by

\[
u(t) = \tilde u[t_k] \qquad \forall\, t \in [t_k, t_{k+1}).
\]

We may then write the following system with jumps describing the relationship between the discrete input $\tilde u$, the continuous input $w$ and the continuous output $z$.
Let

\[
\hat A = \begin{pmatrix} A & B_2 \\ 0 & 0 \end{pmatrix}, \quad
\tilde A = \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}, \quad
\hat B_1 = \begin{pmatrix} B_1 \\ 0 \end{pmatrix}, \quad
\hat B_2 = \begin{pmatrix} 0 \\ I \end{pmatrix}, \quad
\hat C_1 = \begin{pmatrix} C_1 & D_{12} \end{pmatrix};
\tag{4.9}
\]

then it is easy to see that the system

\[
\begin{aligned}
\dot p(t) &= \hat A(t)p(t) + \hat B_1(t)w(t), \qquad t \notin \{t_0, t_1, \ldots, t_f\}, \qquad p(t_0^-) = \begin{pmatrix} x_0 \\ 0 \end{pmatrix},\\
p(t_i) &= \tilde A[t_i]\,p(t_i^-) + \hat B_2[t_i]\,\tilde u[t_i],\\
z(t) &= \hat C_1(t)p(t)
\end{aligned}
\tag{4.10}
\]

is also a realization for the sampled-data system. Then the following theorem is a straightforward consequence of Theorem 4.1.

Theorem 4.4. Consider the system described by equations (4.10). Let

\[
X[t_f] = \begin{pmatrix} Q_f & 0 \\ 0 & 0 \end{pmatrix}
\]

and suppose there exists a bounded solution to the Riccati differential equation

\[
-\dot Z_i = \hat A'Z_i + Z_i\hat A + \gamma^{-2} Z_i \hat B_1 \hat B_1' Z_i + \hat C_1'\hat C_1, \qquad Z_i(t_{i+1}) = X[t_{i+1}]
\]

on each interval $[t_i, t_{i+1}]$ for $0 \le i < f$, where $X[t_i]$ is defined recursively by

\[
X[t_i] = \begin{pmatrix} Z_{i11}(t_i) - Z_{i12}(t_i)Z_{i22}^{-1}(t_i)Z_{i21}(t_i) & 0 \\ 0 & 0 \end{pmatrix}
\]

for $0 \le i < f$. Then the controller

\[
u(t_i) = -Z_{i22}(t_i)^{-1}Z_{i12}(t_i)'x(t_i)
\]

results in $\|T_{zw}\| \le \gamma$.

Proof. All we need to do is apply the formulae of Theorem 4.1 to the above special case, which directly gives the above expressions. The only thing we have to show is that $Z_{i22}(t_i)$ is invertible, so that we can replace the pseudo-inverse by the usual inverse. This holds because equation (4.6) implies

\[
p(t_i)'Z_i(t_i)p(t_i) \ge J_s\bigl(p(t_i^-), \tilde u[t_i], w\bigr) = \int_{t_i}^{t_{i+1}} \bigl( z(t)'z(t) - \gamma^2 w(t)'w(t) \bigr)\, dt + p(t_{i+1}^-)'X[t_{i+1}]\,p(t_{i+1}^-)
\]

and this holds for all $\tilde u[t_i]$, $w$ and $p(t_i^-)$. Choosing $w = 0$ and $x(t_i) = 0$, and using

\[
p(t_i) = \tilde A[t_i]\,p(t_i^-) + \hat B_2[t_i]\,\tilde u[t_i] = \begin{pmatrix} x(t_i) \\ \tilde u[t_i] \end{pmatrix},
\]

we have

\[
\tilde u[t_i]'Z_{i22}(t_i)\tilde u[t_i] \ge \int_{t_i}^{t_{i+1}} z(t)'z(t)\, dt + p(t_{i+1}^-)'X[t_{i+1}]\,p(t_{i+1}^-)
\]

and since $z(t)'z(t) = x(t)'C_1(t)'C_1(t)x(t) + u(t)'u(t) > 0$ for all $u \ne 0$, this implies that $Z_{i22}(t_i) > 0$.

We have now shown that, provided a solution to equation (4.3) exists on $[t_i, t_{i+1}]$, the upper value for the static game on this subinterval is bounded. Hence there will exist a solution to the recursion (4.2) and a well defined minimax control. However, we have not shown that this is a saddle point for the game.
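The backward recursion of Theorem 4.4 can be sketched numerically for a scalar plant. This is a minimal illustration under assumptions that go beyond the text: the plant coefficients `a`, `b1`, `b2`, `c1` are constant scalars, the three independent entries of the symmetric matrix $Z_i$ are integrated backwards with a classical fourth-order Runge-Kutta rule, and no check is made for finite escape of the Riccati solution (which the theorem requires to be bounded). All function names are hypothetical. The scalar equations are the blocks of the Riccati equation for the augmented matrices (4.9), using $D_{12}'C_1 = 0$ and $D_{12}'D_{12} = I$.

```python
def riccati_rhs(z11, z12, z22, a, b1, b2, c1, gamma):
    """Right-hand side of -dZ/dt for the augmented scalar system (4.9)."""
    g = (b1 * b1) / (gamma * gamma)
    d11 = 2.0 * a * z11 + g * z11 * z11 + c1 * c1
    d12 = a * z12 + z11 * b2 + g * z11 * z12
    d22 = 2.0 * b2 * z12 + g * z12 * z12 + 1.0
    return d11, d12, d22


def integrate_backward(z, length, params, steps=200):
    """Integrate Z from t_{i+1} back to t_i with classical RK4.

    With tau = t_{i+1} - t, the equation -dZ/dt = rhs becomes dZ/dtau = rhs.
    """
    h = length / steps
    for _ in range(steps):
        k1 = riccati_rhs(*z, *params)
        k2 = riccati_rhs(*(zi + 0.5 * h * ki for zi, ki in zip(z, k1)), *params)
        k3 = riccati_rhs(*(zi + 0.5 * h * ki for zi, ki in zip(z, k2)), *params)
        k4 = riccati_rhs(*(zi + h * ki for zi, ki in zip(z, k3)), *params)
        z = tuple(zi + h * (p + 2.0 * q + 2.0 * r + s) / 6.0
                  for zi, p, q, r, s in zip(z, k1, k2, k3, k4))
    return z


def sampled_data_recursion(times, qf, a, b1, b2, c1, gamma, steps=200):
    """Return (X11[t_0], gains) with u(t_i) = gains[i] * x(t_i)."""
    params = (a, b1, b2, c1, gamma)
    x11 = qf                      # X[t_f] = diag(Q_f, 0)
    gains = []
    for i in range(len(times) - 2, -1, -1):
        z_end = (x11, 0.0, 0.0)   # Z_i(t_{i+1}) = X[t_{i+1}] = diag(X11, 0)
        z11, z12, z22 = integrate_backward(
            z_end, times[i + 1] - times[i], params, steps)
        gains.append(-z12 / z22)            # u(t_i) = -Z22^{-1} Z12' x(t_i)
        x11 = z11 - z12 * z12 / z22         # Schur complement update of X11
    gains.reverse()
    return x11, gains


# Example: pure integrator, no disturbance channel, two hold intervals.
x11, gains = sampled_data_recursion(
    times=[0.0, 1.0, 2.0], qf=1.0, a=0.0, b1=0.0, b2=1.0, c1=0.0, gamma=1.0)
```

In the example the plant is $\dot x = u$ with cost $\int u^2\,dt + x(t_f)^2$, for which the piecewise-constant optimum can be found by hand: the value is $x_0^2/3$, with feedback gains $-1/3$ and $-1/2$ at the two hold times.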
Clearly, if on each subinterval the upper value exists, then the lower value, given by

\[
\sup_{w \in L_2[t_i,t_{i+1}]}\ \inf_{\tilde u[t_i] \in \mathbb{R}^{m_2}} J_s\bigl(x(t_i), \tilde u[t_i], w\bigr),
\]

is finite, since the quadratic terms in $u$ and $x$ are positive, which implies that the lower value is bounded below, and the lower value is bounded above by the upper value. However, in order for a saddle point to exist we need the lower value to equal the upper value. Note however that the existence of a saddle point is not necessary for the existence of a state feedback controller satisfying the induced 2-norm criterion.

Generalized hold functions. A wide class of systems can be described in the form of equations (4.1). This clearly includes both discrete and continuous time systems as special cases. We have seen above that the sampled-data system, combined with a zero-order hold, fits into this framework. It is also easy to describe sampled-data systems with any hold operator describable by a finite dimensional linear time varying system. For example, if, given $\tilde u[t_i]$, the hold operator with output $u(t)$ on the interval $[t_i, t_{i+1}]$ is represented by the continuous LTI system

\[
\dot x_2 = F x_2, \qquad x_2(t_i) = G\tilde u[t_i], \qquad u(t) = H x_2(t),
\]

then, if this hold operator is connected to the continuous time system (4.8), the corresponding system with jumps has coefficient matrices

\[
\hat A = \begin{pmatrix} A & B_2 H \\ 0 & F \end{pmatrix}, \quad
\tilde A = \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}, \quad
\hat B_1 = \begin{pmatrix} B_1 \\ 0 \end{pmatrix}, \quad
\hat B_2 = \begin{pmatrix} 0 \\ G \end{pmatrix}, \quad
\hat C_1 = \begin{pmatrix} C_1 & D_{12}H \end{pmatrix}.
\]

4.4 Multi-rate controller synthesis

4.4.1 State feedback with a multi-rate hold: A simple example

In this section we proceed with an example of state feedback with a multi-rate hold, where the input consists of two channels $u_1$ and $u_2$, which can change as shown in Figure 4.2. We revert to the continuous time system description given by equations (4.8), hence the state $x$ is now a continuous function, and we do not need to distinguish between $x(t_i^-)$ and $x(t_i)$.
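In the next section we will solve the more general multi-rate state feedback sampled-data problem with an arbitrary number of inputs.

[Figure 4.2: Input signals generated by multi-rate hold (channels $u_1$ and $u_2$ against the times $t_0$, $t_1$, $t_2$, $t_f$)]

In Figure 4.2, $u_1$ is fixed on $[t_1, t_f)$ and also on $[t_a, t_1)$, for some $t_a < t_0$. Also, $u_2$ is required to be fixed on $[t_0, t_2)$ and on $[t_2, t_f)$. We therefore use $\{t_0, t_1, t_2, t_f\}$ as the set of points on which to break up the interval. These choices can be summed up as

\[
\begin{array}{ll}
\text{At time} & \text{Choose} \\
t_0 & u_2 \text{ for the interval } [t_0, t_2) \\
t_1 & u_1 \text{ for the interval } [t_1, t_f) \\
t_2 & u_2 \text{ for the interval } [t_2, t_f)
\end{array}
\tag{4.11}
\]

The first step is to solve the last stage of the dynamic programming recursion (4.2), on the interval $[t_2, t_f]$. Using the same technique as before, first we solve the corresponding maximization problem

\[
\sup_{w \in L_2[t_2,t_f]} J_s\bigl(x(t_2), \tilde u[t_2], w\bigr)
\]

where

\[
J_s\bigl(x(t_2), \tilde u[t_2], w\bigr) = \int_{t_2}^{t_f} \bigl( z(t)'z(t) - \gamma^2 w(t)'w(t) \bigr)\, dt + x(t_f)' \begin{pmatrix} Q_f & 0 \\ 0 & 0 \end{pmatrix} x(t_f).
\]

If there exists a solution to the Riccati differential equation

\[
-\dot Z = \hat A'Z + Z\hat A + \gamma^{-2} Z \hat B_1 \hat B_1' Z + \hat C_1'\hat C_1, \qquad Z(t_f) = \begin{pmatrix} Q_f & 0 \\ 0 & 0 \end{pmatrix}
\]

on the interval $[t_2, t_f]$, then, partitioning $\tilde u[t_2]$ into $\begin{pmatrix} \tilde u_1[t_2] \\ \tilde u_2[t_2] \end{pmatrix}$, we have

\[
\sup_{w \in L_2[t_2,t_f]} J_s\bigl(x(t_2), \tilde u[t_2], w\bigr) = \begin{pmatrix} x(t_2) \\ \tilde u_1[t_2] \\ \tilde u_2[t_2] \end{pmatrix}' \begin{pmatrix} Z_{11} & Z_{12} & Z_{13} \\ Z_{12}' & Z_{22} & Z_{23} \\ Z_{13}' & Z_{23}' & Z_{33} \end{pmatrix}(t_2) \begin{pmatrix} x(t_2) \\ \tilde u_1[t_2] \\ \tilde u_2[t_2] \end{pmatrix}.
\]

We can use the Schur complement formula again to write

\[
\begin{pmatrix} Z_{11} & Z_{12} & Z_{13} \\ Z_{12}' & Z_{22} & Z_{23} \\ Z_{13}' & Z_{23}' & Z_{33} \end{pmatrix} = T' \begin{pmatrix} \tilde Z_{11} & \tilde Z_{12} & 0 \\ \tilde Z_{12}' & \tilde Z_{22} & 0 \\ 0 & 0 & Z_{33} \end{pmatrix} T
\]

where

\[
T = \begin{pmatrix} I & 0 & 0 \\ 0 & I & 0 \\ Z_{33}^{-1}Z_{13}' & Z_{33}^{-1}Z_{23}' & I \end{pmatrix}, \qquad
\tilde Z = \begin{pmatrix} \tilde Z_{11} & \tilde Z_{12} \\ \tilde Z_{12}' & \tilde Z_{22} \end{pmatrix} = \begin{pmatrix} Z_{11} & Z_{12} \\ Z_{12}' & Z_{22} \end{pmatrix} - \begin{pmatrix} Z_{13} \\ Z_{23} \end{pmatrix} Z_{33}^{-1} \begin{pmatrix} Z_{13}' & Z_{23}' \end{pmatrix}.
\]

We can now solve the minimization problem

\[
\inf_{\tilde u_2[t_2]}\ \sup_{w \in L_2[t_2,t_f]} J_s\bigl(x(t_2), \tilde u[t_2], w\bigr)
\]

and the minimizing $\tilde u_2[t_2]$ is given by

\[
\tilde u_2[t_2] = -Z_{33}^{-1}(t_2) \bigl( Z_{13}(t_2)'x(t_2) + Z_{23}(t_2)'\tilde u_1[t_2] \bigr).
\]

As we might expect, the solution for $\tilde u_2[t_2]$ depends on the value of $\tilde u_1[t_2]$, which will have been chosen at time $t_1$.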
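The Schur complement step above, minimizing a quadratic form over one channel while the other is held fixed, can be sketched for the case where the state and both input channels are scalars. The function name and the numerical values are hypothetical, chosen only so that the arithmetic is easy to check by hand; the sketch assumes $Z_{33} > 0$.

```python
def minimize_last_channel(Z):
    """Minimize [x, u1, u2] Z [x, u1, u2]' over u2 for a symmetric 3x3 Z.

    Returns (Z_tilde, u2_star), where Z_tilde is the reduced 2x2 weight on
    (x, u1) and u2_star(x, u1) is the minimizing value of u2.
    """
    z11, z12, z13 = Z[0]
    _, z22, z23 = Z[1]
    z33 = Z[2][2]
    if z33 <= 0:
        raise ValueError("Z33 must be positive for a unique minimizer")
    # Z_tilde = [[Z11, Z12], [Z12, Z22]] - [Z13; Z23] Z33^{-1} [Z13, Z23]
    z_tilde = [[z11 - z13 * z13 / z33, z12 - z13 * z23 / z33],
               [z12 - z13 * z23 / z33, z22 - z23 * z23 / z33]]

    def u2_star(x, u1):
        # u2 = -Z33^{-1} (Z13' x + Z23' u1)
        return -(z13 * x + z23 * u1) / z33

    return z_tilde, u2_star


Z = [[4.0, 1.0, 2.0],
     [1.0, 3.0, 1.0],
     [2.0, 1.0, 2.0]]
z_tilde, u2_star = minimize_last_channel(Z)
u2 = u2_star(1.0, 2.0)
```

For these values the reduced weight is diagonal with entries 2 and 2.5, and the full quadratic form evaluated at $(x, u_1, u_2) = (1, 2, -2)$ equals the reduced form evaluated at $(1, 2)$, as the factorization predicts.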
In order to determine which value we should choose for $\tilde u_1[t_1]$, we need to solve the next step of the recursion on the interval $[t_1, t_2]$. The Value of the game at time $t_2$ is given by

\[
V\bigl(t_2, x(t_2), \tilde u_1[t_1]\bigr) = \begin{pmatrix} x(t_2) \\ \tilde u_1[t_1] \end{pmatrix}' \tilde Z \begin{pmatrix} x(t_2) \\ \tilde u_1[t_1] \end{pmatrix}
\tag{4.12}
\]

and hence the final weight for the next subproblem, and therefore also the boundary condition for the Riccati differential equation on $[t_1, t_2]$, is given by

\[
Z(t_2^-) = \begin{pmatrix} \tilde Z_{11} & \tilde Z_{12} & 0 \\ \tilde Z_{12}' & \tilde Z_{22} & 0 \\ 0 & 0 & 0 \end{pmatrix}.
\]

Note that if we substitute the value of $\tilde u_1[t_1]$ into equation (4.12), this gives the value of the game from time $t_2$ onwards. Also, if we need to calculate the value of the game at a time at which no hold operator becomes active, this can be done by simply omitting the optimization with respect to $u$, since $u$ is fixed at some previous hold time(s).

We now know $\tilde u_2[t_2]$ in terms of the state $x(t_2)$ and the value of $\tilde u_1[t_2]$. The next subproblem, in a similar way, will give $\tilde u_1[t_1]$ in terms of $x(t_1)$ and $\tilde u_2[t_1]$. This chain of optimizations continues until we reach a time when none of the channels are constrained, that is, when we have choice over all spatial components of $u$. This will occur at the initial time, but may also occur at later times. Note also that when the controller is in place, at any time at which we need to choose some components of $u$, all the other components of $u$ have definite values, rather than being functions of past states. Hence we can always calculate the components we need in terms of the current state.

4.4.2 The general multi-rate case for jump systems

In this section, we will return to the problem of synthesizing a state feedback controller for systems with jumps, as in Section 4.2.1. However, we will consider the multi-rate problem; the discrete inputs $\tilde u$ will be divided into `channels', each of which can have a hold operator that becomes active at different times. At each time $t_i$, we will describe which channels of $\tilde u$ can be chosen in the following way.
Let R[tk] 2 Rm2 m2 be a permutation matrix, that is, a nonsingular matrix in which each column contains exactly one element equal to 1, with all the other elements in that column zero. Then partition R[tk] as R[tk] = R?[tk] R[tk] : At time tk, we may partition ~ u[tk] as ~ u[tk] = R[tk]0 R?[tk]~ u[tk] R[tk]~ u[tk] : If we now make the restriction that ~ u must satisfy R?[tk]~ u[tk 1] = R?[tk]~ u[tk] then the choice of R determines the components of ~ u which may change at any time. Only those spatial components of ~ u[tk] in the kernel of R?[tk] may be chosen at time tk. We will use this construction to de ne the multi-rate property of the hold operator. Naturally, the controller will have knowledge of R[tk] at least each time tk, and in fact we shall see that this is all the knowledge of R that the controller needs in order to minimize the upper value of the dynamic game. We assume that R[t0] = I so that all the hold channels are well de ned at time t0. Then the Isaacs recursion becomes V ti; x(t i ); R?[ti]~ u[ti] = inf R[ti]~ u[ti] sup w2L2[ti;ti+1] Z ti+1 ti z(t)0z(t) 2w(t)0w(t) dt+ V ti+1; x(t i+1); R?[ti+1]~ u[ti+1] We now show, as in the single rate case, that V ti; x(t i ); R?[ti]~ u[ti] is a quadratic for all i, 0 i f , and give a recursive formula for it. 4.4 Multi-rate controller synthesis 59 Theorem 4.5. Suppose that V ti+1; x(t i+1); R?[ti+1]~ u[ti+1] = x(t i+1) R?[ti+1]~ u[ti+1] 0X[ti+1] x(ti+1) R?[ti+1]~ u[ti+1] : (4.13) Then, if there exists a bounded solution to Riccati di erential equation _ Zi = Â0Zi + ZiÂ+ 2ZiB̂1B̂0 1Zi + Ĉ1Ĉ 0 1 (4.14) with boundary condition Zi(ti+1) = I 0 0 R?[ti+1] 0X[ti+1] I 0 0 R?[ti+1] (4.15) on the interval [ti; ti+1], then the value at time ti is given by V ti; x(t i ); R?[ti]~ u[ti] = x(t i ) R?[ti]~ u[ti] 0X[ti] x(t i ) R?[ti]~ u[ti] where X[ti] = ~ A0Z ~ A ~ A0ZB̂2R?0 R?B̂0 2Z ~ A R?B̂0 2ZB̂2R?0 ~ A0ZB̂2R0 R?B̂0 2ZB̂2R0 (RB̂0 2ZB̂2R0)y RB̂2Z ~ A RB̂0 2ZB̂2R?0 (4.16) where all matrices are evaluated at time ti. 
Further, one possible minimizing strategy for $R[t_i]\tilde u[t_i]$ is given by

\[
R\tilde u = -\bigl( R\hat B_2'Z\hat B_2R' \bigr)^{\dagger} \bigl( R\hat B_2'Z\tilde A\,x(t_i^-) + R\hat B_2'Z\hat B_2R^{\perp\prime}R^{\perp}\tilde u \bigr)
\tag{4.17}
\]

and the corresponding worst case disturbance $w \in L_2[t_i, t_{i+1}]$ is given by

\[
w(t) = \gamma^{-2}\hat B_1(t)'Z_i(t)x(t).
\]

Proof. We could now apply Theorem 4.1 directly, since clearly the multi-rate problem for a system with jumps is in fact a special case of the original synthesis problem for systems with jumps. However, the direct proof is straightforward, and illustrates the relationship to the previous special case with two inputs, so we include it here. Define the cost function on the subinterval $[t_i, t_{i+1}]$

\[
J_s\bigl(x(t_i^-), \tilde u[t_i], w\bigr) = \int_{t_i}^{t_{i+1}} \bigl( z(t)'z(t) - \gamma^2 w(t)'w(t) \bigr)\, dt + \begin{pmatrix} x(t_{i+1}^-) \\ R^{\perp}[t_{i+1}]\tilde u[t_{i+1}] \end{pmatrix}' X[t_{i+1}] \begin{pmatrix} x(t_{i+1}^-) \\ R^{\perp}[t_{i+1}]\tilde u[t_{i+1}] \end{pmatrix}.
\tag{4.18}
\]

Then the Isaacs recursion becomes, at time $t_i$,

\[
V\bigl(t_i, x(t_i^-), R^{\perp}[t_i]\tilde u[t_i]\bigr) = \inf_{R[t_i]\tilde u[t_i]}\ \sup_{w \in L_2[t_i,t_{i+1}]} J_s\bigl(x(t_i^-), \tilde u[t_i], w\bigr).
\tag{4.19}
\]

As in the previous example, we first solve the maximization with respect to $w$ for fixed $\tilde u[t_i]$. Straightforward completion of the square gives

\[
J_s\bigl(x(t_i^-), \tilde u[t_i], w\bigr) = x(t_i)'Z_i(t_i)x(t_i) - \int_{t_i}^{t_{i+1}} \gamma^2 \bigl| w(t) - \gamma^{-2}\hat B_1(t)'Z_i(t)x(t) \bigr|^2\, dt
\tag{4.20}
\]

which gives the above expression for the worst case $w$.
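We then have to solve the minimization problem

\[
\begin{aligned}
V\bigl(t_i, x(t_i^-), R^{\perp}[t_i]\tilde u[t_i]\bigr) &= \inf_{R[t_i]\tilde u[t_i]}\ \sup_{w \in L_2[t_i,t_{i+1}]} J_s\bigl(x(t_i^-), \tilde u[t_i], w\bigr) \\
&= \inf_{R[t_i]\tilde u[t_i]} x(t_i)'Z_i(t_i)x(t_i) \\
&= \inf_{R[t_i]\tilde u[t_i]} \bigl( \tilde A[t_i]x(t_i^-) + \hat B_2[t_i]\tilde u[t_i] \bigr)' Z_i(t_i) \bigl( \tilde A[t_i]x(t_i^-) + \hat B_2[t_i]\tilde u[t_i] \bigr) \\
&= \inf_{R[t_i]\tilde u[t_i]} \begin{pmatrix} x(t_i^-) \\ \tilde u[t_i] \end{pmatrix}' \begin{pmatrix} \tilde A[t_i]'Z_i(t_i)\tilde A[t_i] & \tilde A[t_i]'Z_i(t_i)\hat B_2[t_i] \\ \hat B_2[t_i]'Z_i(t_i)\tilde A[t_i] & \hat B_2[t_i]'Z_i(t_i)\hat B_2[t_i] \end{pmatrix} \begin{pmatrix} x(t_i^-) \\ \tilde u[t_i] \end{pmatrix}.
\end{aligned}
\]

Since we only wish to minimize with respect to the channels of $\tilde u[t_i]$ which we are allowed to change at time $t_i$, we now separate $\tilde u$ into a sum of those channels which can be changed and those that cannot, using

\[
\tilde u[t_i] = \begin{pmatrix} R^{\perp}[t_i] \\ R[t_i] \end{pmatrix}' \begin{pmatrix} R^{\perp}[t_i]\tilde u[t_i] \\ R[t_i]\tilde u[t_i] \end{pmatrix}.
\]

Hence

\[
V\bigl(t_i, x(t_i^-), R^{\perp}[t_i]\tilde u[t_i]\bigr) = \inf_{R[t_i]\tilde u[t_i]} \begin{pmatrix} x(t_i^-) \\ R^{\perp}[t_i]\tilde u[t_i] \\ R[t_i]\tilde u[t_i] \end{pmatrix}' W \begin{pmatrix} x(t_i^-) \\ R^{\perp}[t_i]\tilde u[t_i] \\ R[t_i]\tilde u[t_i] \end{pmatrix}
\]

where $W$ is given by

\[
\begin{aligned}
W &= \begin{pmatrix} I & 0 \\ 0 & \begin{pmatrix} R^{\perp}[t_i] \\ R[t_i] \end{pmatrix} \end{pmatrix} \begin{pmatrix} \tilde A'Z_i\tilde A & \tilde A'Z_i\hat B_2 \\ \hat B_2'Z_i\tilde A & \hat B_2'Z_i\hat B_2 \end{pmatrix} \begin{pmatrix} I & 0 \\ 0 & \begin{pmatrix} R^{\perp}[t_i] \\ R[t_i] \end{pmatrix}' \end{pmatrix} \\
&= \begin{pmatrix} \tilde A'Z_i\tilde A & \tilde A'Z_i\hat B_2R^{\perp\prime} & \tilde A'Z_i\hat B_2R' \\ R^{\perp}\hat B_2'Z_i\tilde A & R^{\perp}\hat B_2'Z_i\hat B_2R^{\perp\prime} & R^{\perp}\hat B_2'Z_i\hat B_2R' \\ R\hat B_2'Z_i\tilde A & R\hat B_2'Z_i\hat B_2R^{\perp\prime} & R\hat B_2'Z_i\hat B_2R' \end{pmatrix}
\end{aligned}
\]

where all matrices are evaluated at time $t_i$. Applying the Schur complement formula to perform the minimization directly gives the desired result.

Note that, in the above formula for the minimax controller given by equation (4.17), we have $R[t_i]\tilde u[t_i]$ given in terms of both $x(t_i^-)$ and $R^{\perp}[t_i]\tilde u[t_i]$. This is as we expect, since $R^{\perp}[t_i]\tilde u[t_i]$ has a fixed value at time $t_i$: it has already been chosen at previous hold times and cannot be changed at time $t_i$.

Theorem 4.6. Suppose there exists a multi-rate state feedback control law given by $\tilde u[t_i] = \mu\bigl(t_i, x(t_i^-)\bigr)$ for $i = 1, \ldots, f$, satisfying the condition that $R^{\perp}[t_k]\tilde u[t_{k-1}] = R^{\perp}[t_k]\tilde u[t_k]$, which results in

\[
\sup_{w \in L_2[t_0,t_f]} \left\{ \int_{t_0}^{t_f} \bigl( z(t)'z(t) - \gamma^2 w(t)'w(t) \bigr)\, dt + x(t_f^-)'Q_f\,x(t_f^-) \right\} \le 0
\]

with initial state $x(t_0^-) = 0$.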
De ne ̂ as the in mum of the set of such that the Riccati di erential equation (4.15) with boundary conditions (4.14) has a bounded solution on [ti; ti+1] for all i = 1; : : : ; f . Then ̂. Proof. Suppose that < ̂. Then, for some i, is strictly less than the in mum of the set of such that the Riccati di erential equation (4.14) has a bounded solution on [ti; ti+1]. Therefore, applying Lemma 4.2, we know that the maximization problem given by equation (4.5) is unbounded with respect to w. Hence the upper value of the game is unbounded. 4.4.3 The general multi-rate case for sampled-data systems We now return to the problem of nding a state feedback sampled-data for the multirate problem. We will use the same system setup as in Section 4.3, with the multi-rate restriction of the previous section that ~ u must satisfy R?[tk]~ u[tk 1] = R?[tk]~ u[tk]: Again, we need only apply Theorem 4.5 to the special case of the system speci ed by equations (4.8). This gives the following result. Theorem 4.7. Consider the system described by equations (4.10). Let X[tf ] = Qf 0 0 0 and suppose there exists a bounded solution to the Riccati di erential equation _ Si = A0Si + SiA+ 2SiB1B0 1Si + C 0 1C1 Si(ti+1) = X11[ti+1] (4.21) on the interval [ti; ti+1]. Then there exists a bounded solution to the Riccati di erential equation _ Zi = Â0Zi + ZiÂ+ 2ZiB̂1B̂0 1Zi + Ĉ1Ĉ 0 1 Zi(ti+1) = X[ti+1] (4.22) on each interval [ti; ti+1] for 0 i < f , where X[ti] is de ned recursively by X[ti] = Zi11 Zi12R?0 R?Zi21 R?Zi22R?0 Zi12R0 R?Zi22R0 (RZi22R0) 1 RZi21 RZi22R?0 : (4.23) with all matrices evaluated at time ti. Then the multi-rate sampled-data state feedback controller R[ti]~ u[ti] = RZi22R0 1 RZi12x(ti)0 +RZi22R?0R?~ u[ti] (4.24) results in kTzwk . 4.5 One step delayed output feedback 62 Proof. 
We can explicitly write out equation (4.22) as _ Zi11 = A0Zi11 + Zi11A+ 2Zi11B1B0 1Zi11 + C 0 1C1 Zi11(ti+1) = X11[ti+1] _ Zi22 = B0 2Zi11 + Zi012B2 + I Zi22(ti+1) = R?0[ti+1]X12R?[ti+1] _ Zi12 = A0Zi12 + Zi11B2 + 2Zi11B1B0 1Zi12[ti+1] Zi12(ti+1) = X12[ti+1]R?[ti+1] using the standard fact (see for example [59]) that solutions to this form of Riccati differential equation are symmetric. The rst of these equations is simply equation (4.21), and by assumption this has a bounded solution. The next two are then linear equations with bounded time varying coe cients, and hence have well de ned solutions on any nite interval, see for example [30]. Therefore, existence of a solution to (4.21) is su cient for existence of a solution to (4.22). We now need to apply Theorem 4.5 to the special case of the system speci ed by equations (4.8). The expressions for the controller and X[ti] are given by direct substitution of equations (4.9) into equations (4.16) and (4.17). All that remains of the proof is to show that RZi22R0 is invertible, so that we can replace the pseudo inverse with the actual inverse. We can show that Zi22 is invertible in exactly the same was as in the proof of Theorem 4.4. Then, since R is full row rank, the result follows. 4.5 One step delayed output feedback 4.5.1 Problem formulation For the remainder of this chapter, we restrict attention to the sampled-data problem, rather than considering general systems with jumps, and concentrate on the multi-rate case. We will consider the system described by the following equations on the time interval [t0; tf ] R+. _ x(t) = A(t)x(t) +B1(t)w(t) +B2(t)u(t) x(t0) = x0 z(t) = C1(t)x(t) +D12(t)u(t) y(t) = C2x(t) The system matrices are assumed to be bounded functions of time. The times t0 t1 tf are those at which either a hold operator or a sampling operator becomes active. 
That is, at each tk, 0 k f , we either receive new information about some spatial components of y(tk), or we must decide some spatial components of the signal u on [tk; tk+j] for some j > 0. The spatial components here correspond to di erent `channels' in Figure 4.1. The set of possible choices of u will be determined by the matrix R as before. The discrete measured output is then ~ y[tk] = [tk](Sy)[tk] + [tk]D21[tk]v[tk] where [tk] is a square diagonal matrix whose elements are either 1 or 0, and the operator S : L p2 2 [t0; tf ]! `p2 2 [0; f ] is a sampling operator. The signal v is a discrete 4.5 One step delayed output feedback 63 noise input. If the ith component of [tk] is a 1, this indicates that the ith component of ~ y[tk] is available for measurement at that time. The actual sampling operator S need not be diagonal for the details here to work, although it clearly makes more physical sense that it be so. The sampling operator we will use here will be de ned by (Sy)[tk] = y(tk) Note that, since we are sampling the output of the system, and there is no direct feedthrough from w to y, this operation is well-de ned for L2 inputs u and w. Also, at each time tk, the controller knows [tk], and we shall see that the controller does not in fact need advance knowledge of . We again make the following assumptions about the system matrices: D12(t)0C1(t) = 0 D12(t)0D12(t) = I D21[tk]B0 1(tk) = 0 D21[tk]D21[tk] = I for all t and tk. The hold operator H : `m2 2 [0; f ]! L m2 2 [t0; tf ] is de ned by u(t) = ~ u[tk] 8 t 2 [tk; tk+1]: The disturbance has a discrete component v as well as a continuous component w, and we will also include the initial state x0 as part of the disturbance. We therefore consider the disturbance as an element of the product spaceL2 `2 Rn, and shall write the elements of this space as (w; v; x0). Then the norm of the disturbance (w; v; x0) is given by k(w; v; x0)k2 = kwk22 + kvk22 + x00Q0x0: where Q0 > 0 is some given weighting matrix. 
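The role of the diagonal 0/1 selection matrix in the measurement equation can be illustrated by constructing it from per-channel sampling schedules. This is a small sketch under stated assumptions: only the sampling instants are merged (in the text the grid $t_0, \ldots, t_f$ also contains the instants at which hold operators become active), the instants are given as exact numbers so that equality tests are meaningful, and the names `merged_grid` and `channel_times` are hypothetical.

```python
def merged_grid(channel_times):
    """Merge asynchronous per-channel sampling instants into one grid.

    channel_times[i] lists the instants at which output channel i is sampled.
    Returns (grid, select), where grid is the sorted union of all instants and
    select[k] is the diagonal of the 0/1 selection matrix at grid[k]:
    select[k][i] == 1 exactly when channel i is measured at time grid[k].
    """
    channel_sets = [set(times) for times in channel_times]
    grid = sorted(set().union(*channel_sets))
    select = [[1 if t in s else 0 for s in channel_sets] for t in grid]
    return grid, select


# Two output channels sampled at different, non-commensurate-looking instants.
grid, select = merged_grid([[0.0, 1.0, 2.0], [0.0, 1.5]])
```

No periodicity is assumed anywhere in the construction, which matches the aperiodic, asynchronous setting of this chapter: the controller only needs to know, at each grid time, which channels have just been measured.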
Let T : (w; v; x0) 7! z be the closed loop linear operator with the controller ~ u = ~ y. Our aim is to construct a causal controller such that kT k := sup (w;v;x0) kT (w; v; x0)k k(w; v; x0)k which occurs if and only if kzk22 2kwk22 2kvk22 2x00Q0x0 0 for all disturbances (w; v; x0). We therefore try to solve the di erential game inf 2Umrfb sup w2L2 v2`2 x02Rn 2x00Q00x0 + Z tf t0 z(t)0z(t) 2w(t)0w(t) dt 2 f 1 Xi=0 v[ti]0v[ti] + x(tf )0Qfx(tf ) (4.25) where Umrfb is the set of all causal multi-rate sampled-data controllers mapping ~ y to ~ u. 4.5 One step delayed output feedback 64 4.5.2 An expression for the information state We would now like to construct output feedback controllers for the sampled data problem using the information state approach. The development in this section parallels that in Section 3.6. The problem we shall solve is to nd the minimax controller for the above problem when ~ u[tj] depends on [ti]~ y[ti] for i strictly less than j. That is, the controller has access to past samples of the measured output, but not access to samples which occur at time tj. As before, the cost function is J(x0; ; w; v) = 2x00Q00x0 + Z tf t0 z(t)0z(t) 2w(t)0w(t) dt 2 f 1 Xi=0 v[ti]0v[ti] + x(tf )0Qfx(tf ): (4.26) It is possible to regard the sampled-data system as a discrete time system, using the technique of lifting. That is, at each time tj, we regard the signals w and z as taking values in L2[tj; tj+1]. This technique is used by several authors for both analysis and synthesis in sampled-data problems, when the system has single periodic synchronized sample and hold devices. We will not use this technique here, although it may make comparisons with Section 3.6 easier to see to regard w(j) and z(j) in that section as corresponding to w[tj ;tj+1] and z[tj ;tj+1] in this section. 
Define the information state for the sampled-data problem by
\[ P(t_j, \xi) = \sup_{w\in L_2[t_0,t_j]} \; \sup_{v\in\ell_2[t_0,t_{j-1}]} \; \sup_{x_0\in\mathbb{R}^n} \Big\{ -\gamma^2x_0'Q_0x_0 + \int_{t_0}^{t_j} \big( z(t)'z(t) - \gamma^2w(t)'w(t) \big)\,dt - \gamma^2\sum_{i=0}^{j-1} v[t_i]'v[t_i] \Big\} \]
subject to the constraints
\[ \xi = x(t_j), \qquad \tilde y[t_i] = \Sigma[t_i]C_2(t_i)x(t_i) + \Sigma[t_i]D_{21}[t_i]v[t_i] \quad \text{for all } i = 0, \dots, j-1. \]
It is then easy to show by induction that $P$ satisfies the dynamic programming recursion
\[ P(t_{j+1}, \xi) = \sup_{x(t_j)\in\mathbb{R}^n} \; \sup_{w\in L_2[t_j,t_{j+1}]} \; \sup_{v[t_j]\in\mathbb{R}^{m_3}} \Big\{ P(t_j, x(t_j)) + \int_{t_j}^{t_{j+1}} \big( z(t)'z(t) - \gamma^2w(t)'w(t) \big)\,dt - \gamma^2v[t_j]'v[t_j] \Big\} \]
subject to the constraints that
\[ \xi = x(t_{j+1}), \qquad \dot x(t) = A(t)x(t) + B_1(t)w(t) + B_2(t)\tilde u[t_j], \qquad \tilde y[t_j] = \Sigma[t_j]C_2(t_j)x(t_j) + \Sigma[t_j]D_{21}[t_j]v[t_j] \]
on the interval $[t_j, t_{j+1}]$, and with the boundary condition
\[ P(t_0, x_0) = -\gamma^2x_0'Q_0x_0. \]
Note that $P(t_j, x(t_j))$ is actually a function of $\tilde y_{[t_0,t_{j-1}]}$ and $\tilde u_{[t_0,t_{j-1}]}$ also, but we omit these arguments to reduce notation. We will now show by induction that $P$ takes the form
\[ P(t_j, x(t_j)) = -\gamma^2\big( x(t_j) - \eta(t_j) \big)'L(t_j)\big( x(t_j) - \eta(t_j) \big) + c[t_j] \tag{4.27} \]
and give recursive formulae for $L$ and $\eta$. In this expression $c[t_j]$ is independent of $x(t_j)$, and depends solely on $y_{[t_0,t_{j-1}]}$ and $u_{[t_0,t_{j-1}]}$.

Firstly, note that the above equation holds at time $t_0$, with $L(t_0) = Q_0$, $\eta(t_0) = 0$ and $c[t_0] = 0$. Now suppose it holds at time $t_j$. Then
\[ P(t_{j+1}, \xi) = \sup_{x(t_j)\in\mathbb{R}^n} \; \sup_{w\in L_2[t_j,t_{j+1}]} \; \sup_{v[t_j]\in\mathbb{R}^{m_3}} \Big\{ -\gamma^2\big( x(t_j) - \eta(t_j) \big)'L(t_j)\big( x(t_j) - \eta(t_j) \big) + \int_{t_j}^{t_{j+1}} \big( z(t)'z(t) - \gamma^2w(t)'w(t) \big)\,dt - \gamma^2v[t_j]'v[t_j] + c[t_j] \Big\} \]
subject to the constraints that
\[ \xi = x(t_{j+1}), \qquad \dot x(t) = A(t)x(t) + B_1(t)w(t) + B_2(t)\tilde u[t_j], \qquad \tilde y[t_j] = \Sigma[t_j]C_2(t_j)x(t_j) + \Sigma[t_j]D_{21}[t_j]v[t_j] \]
on the interval $[t_j, t_{j+1}]$. We now solve this maximization problem directly. Denote the signal $w$ on $[t_j, t_{j+1}]$ by $w_j$, and similarly for $z_j$. Then we can write the system equations on $[t_j, t_{j+1}]$ as
\[ \begin{pmatrix} x(t_j) \\ z_j \\ \tilde y[t_j] \end{pmatrix} = \Theta \begin{pmatrix} x(t_{j+1}) \\ \tilde u[t_j] \\ w_j \\ v[t_j] \end{pmatrix} \tag{4.28} \]
Here $\Theta$ is a linear operator between Cartesian product spaces.
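Then we can write
\[ P(t_{j+1}, \xi) = \sup_{x(t_j)\in\mathbb{R}^n} \; \sup_{w_j\in L_2[t_j,t_{j+1}]} \; \sup_{v[t_j]\in\mathbb{R}^{m_3}} \Big\{ -\gamma^2\big\langle x(t_j) - \eta(t_j),\, L(t_j)\big( x(t_j) - \eta(t_j) \big) \big\rangle + \langle z_j, z_j\rangle - \gamma^2\langle w_j, w_j\rangle - \gamma^2\langle v[t_j], v[t_j]\rangle \Big\} \]
again subject to the above constraints. Then, using similar notation to that of Section 3.5.2, we can write the operator $\Theta$ as
\[ \Theta = \begin{bmatrix} \hat F\hat\Phi & \hat F\hat\Psi B_2H & \hat F\hat\Psi B_1 & 0 \\ C_1\hat\Phi & C_1\hat\Psi B_2H + D_{12}H & C_1\hat\Psi B_1 & 0 \\ \Sigma[t_j]C_2(t_j)\hat F\hat\Phi & \Sigma[t_j]C_2(t_j)\hat F\hat\Psi B_2H & \Sigma[t_j]C_2(t_j)\hat F\hat\Psi B_1 & \Sigma[t_j]D_{21}[t_j] \end{bmatrix} \]
where
\[ \hat F : L_2[t_j,t_{j+1}] \to \mathbb{R}^n, \qquad \hat Fu := u(t_j) \]
\[ \hat\Phi : \mathbb{R}^n \to L_2[t_j,t_{j+1}], \qquad (\hat\Phi\xi)(t) := \Phi(t, t_{j+1})\xi \]
\[ \hat\Psi : L_2[t_j,t_{j+1}] \to L_2[t_j,t_{j+1}], \qquad (\hat\Psi u)(t) := -\int_t^{t_{j+1}} \Phi(t,s)u(s)\,ds \]
\[ H : \mathbb{R}^n \to L_2[t_j,t_{j+1}], \qquad (H\xi)(t) := \xi \]
with $\Phi(t,s)$ the transition matrix for $A$. We use $B_1$ and $B_2$ as multiplication operators also, without distinguishing this in the notation. Since $\Theta_{11} = \Phi(t_j, t_{j+1})$, and a transition matrix is always invertible, for every fixed $w_j$ and $v[t_j]$ there is a bijection between the states $x(t_j)$ and $x(t_{j+1})$. Therefore, fixing $x(t_{j+1}) = \xi$, there is no need to maximize with respect to $x(t_j)$ in the expression for $P$. Therefore,
\[ P(t_{j+1}, x(t_{j+1})) = \sup_{w_j\in L_2[t_j,t_{j+1}]} \; \sup_{v[t_j]\in\mathbb{R}^{m_3}} \Big\{ -\gamma^2\big\langle x(t_j) - \eta(t_j),\, L(t_j)\big( x(t_j) - \eta(t_j) \big) \big\rangle + \langle z_j, z_j\rangle - \gamma^2\langle w_j, w_j\rangle - \gamma^2\langle v[t_j], v[t_j]\rangle \Big\}. \]
Introduce the Lagrange multiplier $\lambda \in \mathbb{R}^{p_2}$, then let
\[ \tilde P = -\gamma^2\big\langle x(t_j) - \eta(t_j),\, L(t_j)\big( x(t_j) - \eta(t_j) \big) \big\rangle + \langle z_j, z_j\rangle - \gamma^2\langle w_j, w_j\rangle - \gamma^2\langle v[t_j], v[t_j]\rangle + \langle\lambda, \tilde y[t_j]\rangle - \big\langle\lambda,\, \Theta_{31}x(t_{j+1}) + \Theta_{32}\tilde u[t_j] + \Theta_{33}w_j + \Theta_{34}v[t_j] \big\rangle. \]
We can now substitute the expressions for $z_j$ and $x(t_j)$ from equation (4.28). On an inner product space, if the maximum of $\langle v, Av\rangle + \langle b, v\rangle$ exists, then the optimal $v$ satisfies $2Av + b = 0$. Therefore, if we denote the maximizing, or worst case, inputs for this subproblem by $\hat w$ and $\hat v$, with corresponding trajectory written as $\hat x_j$, they must satisfy
\[ \begin{aligned} \gamma^2\hat w &= -\gamma^2\Theta_{13}^*L\big( x(t_j) - \eta(t_j) \big) + \Theta_{23}^*z_j - \tfrac12\Theta_{33}^*\lambda \\ \gamma^2\hat v &= -\gamma^2\Theta_{14}^*L\big( x(t_j) - \eta(t_j) \big) + \Theta_{24}^*z_j - \tfrac12\Theta_{34}^*\lambda. \end{aligned} \tag{4.29} \]
We must now calculate the dual operator, $\Theta^*$. The following easily derived lemma will be useful for this purpose.

Lemma 4.8.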
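Consider the system
\[ \dot x = Ax + Bq_2, \qquad x(t_1) = q_1, \qquad p_1 = x(t_0), \qquad p_2 = Cx + Dq_2 \]
and denote this operator by
\[ \begin{pmatrix} p_1 \\ p_2 \end{pmatrix} = \Gamma \begin{pmatrix} q_1 \\ q_2 \end{pmatrix}; \]
then if we write the dual operation
\[ \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix} = \Gamma^* \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} \]
then $\Gamma^*$ has realization
\[ \dot\lambda = -A'\lambda - C'\alpha_2, \qquad \lambda(t_0) = -\alpha_1, \qquad \beta_1 = -\lambda(t_1), \qquad \beta_2 = B'\lambda + D'\alpha_2. \]

Suppose that
\[ \begin{pmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \\ \beta_4 \end{pmatrix} = \Theta^* \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{pmatrix}. \tag{4.30} \]
Here $\beta_1 \in \mathbb{R}^n$, $\beta_2 \in \mathbb{R}^{m_2}$, $\beta_3 \in L_2[t_j,t_{j+1}]$, $\beta_4 \in \mathbb{R}^{m_3}$, and $\alpha_1 \in \mathbb{R}^n$, $\alpha_2 \in L_2[t_j,t_{j+1}]$, $\alpha_3 \in \mathbb{R}^{p_2}$. In the same way as for the above lemma, the dual operator $\Theta^*$ has realization
\[ \begin{aligned} \dot\lambda_j(t) &= -A(t)'\lambda_j(t) - C_1(t)'\alpha_2(t), & \lambda_j(t_j) &= -\big( \alpha_1 + C_2(t_j)'\Sigma[t_j]\alpha_3 \big) \\ \beta_1 &= -\lambda_j(t_{j+1}), & \beta_2 &= H^*B_2(t)'\lambda_j(t) + H^*D_{12}(t)'\alpha_2(t) \\ \beta_3(t) &= B_1(t)'\lambda_j(t), & \beta_4 &= D_{21}[t_j]'\Sigma[t_j]\alpha_3. \end{aligned} \]
Comparing equations (4.29) with equation (4.30), if we choose
\[ \alpha_1 = -\gamma^2L(t_j)\big( \hat x_j(t_j) - \eta(t_j) \big), \qquad \alpha_2 = z_j, \qquad \alpha_3 = -\tfrac12\lambda \]
then equations (4.29) imply that
\[ \gamma^2\hat w = \beta_3, \qquad \gamma^2\hat v = \beta_4. \]
Hence we can realize equations (4.29) as
\[ \begin{aligned} \dot{\hat x}_j &= A\hat x_j + \gamma^{-2}B_1B_1'\lambda_j + B_2u \\ \dot\lambda_j &= -A'\lambda_j - C_1'C_1\hat x_j \\ \hat w &= \gamma^{-2}B_1'\lambda_j \\ \hat v[t_j] &= -\tfrac12\gamma^{-2}D_{21}[t_j]'\Sigma[t_j]\lambda \end{aligned} \]
with boundary condition
\[ \lambda_j(t_j) = \gamma^2L(t_j)\big( \hat x_j(t_j) - \eta(t_j) \big) + \tfrac12C_2(t_j)'\Sigma[t_j]\lambda \]
and where $\hat x_j(t_{j+1}) = x(t_{j+1})$ is given. Then the constraint can be written as
\[ \tilde y[t_j] = \Sigma[t_j]C_2(t_j)\hat x_j(t_j) + \Sigma[t_j]D_{21}[t_j]\hat v[t_j] = \Sigma[t_j]C_2(t_j)\hat x_j(t_j) - \tfrac12\gamma^{-2}\Sigma[t_j]\lambda \]
which gives an expression for the Lagrange multiplier
\[ \tfrac12\Sigma[t_j]\lambda = -\gamma^2\big( \tilde y[t_j] - \Sigma[t_j]C_2(t_j)\hat x_j(t_j) \big) \]
and hence the worst case disturbance is
\[ \hat v[t_j] = D_{21}[t_j]'\big( \tilde y[t_j] - \Sigma[t_j]C_2(t_j)\hat x_j(t_j) \big). \]
Now let $Y_j$ be the solution to the Riccati differential equation defined on $[t_j, t_{j+1}]$
\[ \dot Y_j = AY_j + Y_jA' + \gamma^{-2}Y_jC_1'C_1Y_j + B_1B_1' \]
and let
\[ \bar x_j(t) = \hat x_j(t) - \gamma^{-2}Y_j(t)\lambda_j(t). \]
We now substitute $\bar x_j$ and $Y_j$ for $\lambda_j$. This gives
\[ \begin{aligned} \dot{\hat x}_j &= A\hat x_j + B_1B_1'Y_j^{-1}(\hat x_j - \bar x_j) + B_2u \\ \dot{\bar x}_j &= A\bar x_j + \gamma^{-2}Y_jC_1'C_1\bar x_j + B_2u \\ \hat w &= B_1'Y_j^{-1}(\hat x_j - \bar x_j) \\ \hat v[t_j] &= D_{21}[t_j]'\big( \tilde y[t_j] - \Sigma[t_j]C_2(t_j)\hat x_j(t_j) \big) \end{aligned} \]
on $[t_j, t_{j+1}]$. The boundary conditions at $t_j$ are given by
\[ \bar x_j(t_j) = \hat x_j(t_j) - Y_j(t_j)L(t_j)\big( \hat x_j(t_j) - \eta(t_j) \big) + Y_j(t_j)C_2(t_j)'\Sigma[t_j]\big( \tilde y[t_j] - C_2(t_j)\hat x_j(t_j) \big). \]
We would like to choose this $\bar x_j(t_j)$ and $Y_j(t_j)$ so that $\bar x_j$ is independent of $\hat x_j$.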
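In order to do this, we choose $\bar x_j(t_j)$ and $Y_j(t_j)$ so that the above equation is true for all $\hat x_j(t_j)$. This gives
\[ Y_j^{-1}(t_j) = L(t_j) + C_2(t_j)'\Sigma[t_j]C_2(t_j) \tag{4.31} \]
\[ \bar x_j(t_j) = Y_j\big( C_2'\Sigma\tilde y + L\eta \big)(t_j). \tag{4.32} \]
Note that the latter of these conditions can also be written as
\[ \bar x_j(t_j) = \eta(t_j) + Y_jC_2'\Sigma\big( \tilde y[t_j] - C_2\eta(t_j) \big). \]
We now wish to evaluate $P(t_{j+1}, x(t_{j+1}))$. In order to do this, let $e_j = \hat x_j - \bar x_j$. Then
\[ \dot e_j = Ae_j - \gamma^{-2}Y_jC_1'C_1\bar x_j + B_1B_1'Y_j^{-1}e_j \]
and consider $e_j'Y_j^{-1}e_j$. Differentiating this, and using $z'z = \hat x_j'C_1'C_1\hat x_j + u'u$, gives
\[ \begin{aligned} \frac{d}{dt}\, e_j'Y_j^{-1}e_j &= e_j'Y_j^{-1}B_1B_1'Y_j^{-1}e_j - 2\gamma^{-2}\bar x_j'C_1'C_1e_j - \gamma^{-2}e_j'C_1'C_1e_j \\ &= -\gamma^{-2}z'z + \gamma^{-2}u'u + \hat w'\hat w + \gamma^{-2}\bar x_j'C_1'C_1\bar x_j. \end{aligned} \]
Then
\[ \begin{aligned} P(t_{j+1}, x(t_{j+1})) &= -\gamma^2\big( \hat x_j(t_j) - \eta(t_j) \big)'L(t_j)\big( \hat x_j(t_j) - \eta(t_j) \big) + \int_{t_j}^{t_{j+1}} \big\{ z(t)'z(t) - \gamma^2\hat w(t)'\hat w(t) \big\}\,dt - \gamma^2\hat v[t_j]'\hat v[t_j] \\ &= -\gamma^2\big( \hat x_j(t_j) - \eta(t_j) \big)'L(t_j)\big( \hat x_j(t_j) - \eta(t_j) \big) - \gamma^2e_j(t_{j+1})'Y_j^{-1}(t_{j+1})e_j(t_{j+1}) + \gamma^2e_j(t_j)'Y_j^{-1}(t_j)e_j(t_j) \\ &\qquad - \gamma^2\hat v[t_j]'\hat v[t_j] + \int_{t_j}^{t_{j+1}} \Big\{ z(t)'z(t) - \gamma^2\hat w(t)'\hat w(t) + \gamma^2\frac{d}{dt}\, e_j'Y_j^{-1}e_j \Big\}\,dt \\ &= -\gamma^2\big( \hat x_j(t_j) - \eta(t_j) \big)'L(t_j)\big( \hat x_j(t_j) - \eta(t_j) \big) + \gamma^2e_j(t_j)'Y_j^{-1}(t_j)e_j(t_j) - \gamma^2\hat v[t_j]'\hat v[t_j] \\ &\qquad - \gamma^2e_j(t_{j+1})'Y_j^{-1}(t_{j+1})e_j(t_{j+1}) + \int_{t_j}^{t_{j+1}} \big\{ \bar x_j'C_1'C_1\bar x_j + u(t)'u(t) \big\}\,dt. \end{aligned} \]
We make use of equation (4.31), and evaluate all expressions at $t_j$ in the following manipulations:
\[ \begin{aligned} \gamma^2(\hat x_j - \bar x_j)'Y_j^{-1}(\hat x_j - \bar x_j) - \gamma^2(\hat x_j - \eta)'L(\hat x_j - \eta) &= \gamma^2(\hat x_j - \bar x_j)'(L + C_2'\Sigma C_2)(\hat x_j - \bar x_j) - \gamma^2(\hat x_j - \eta)'L(\hat x_j - \eta) \\ &= \gamma^2e_j'C_2'\Sigma C_2e_j - 2\gamma^2e_j'L(\bar x_j - \eta) - \gamma^2(\bar x_j - \eta)'L(\bar x_j - \eta). \end{aligned} \]
Now note that combining equations (4.31) and (4.32) gives
\[ C_2'\Sigma\tilde y - C_2'\Sigma C_2\bar x_j = L(\bar x_j - \eta) \]
so
\[ \gamma^2(\hat x_j - \bar x_j)'Y_j^{-1}(\hat x_j - \bar x_j) - \gamma^2(\hat x_j - \eta)'L(\hat x_j - \eta) = \gamma^2e_j'C_2'\Sigma C_2e_j - 2\gamma^2e_j'C_2'\Sigma\tilde y + 2\gamma^2e_j'C_2'\Sigma C_2\bar x_j - \gamma^2(\bar x_j - \eta)'L(\bar x_j - \eta). \]
Then
\[ \begin{aligned} \hat v'\hat v &= \tilde y'\tilde y + \bar x_j'C_2'\Sigma C_2\bar x_j - 2\tilde y'\Sigma C_2\bar x_j + e_j'C_2'\Sigma C_2e_j - 2\tilde y'\Sigma C_2e_j + 2\bar x_j'C_2'\Sigma C_2e_j \\ &= \tilde y'\tilde y + \bar x_j'C_2'\Sigma C_2\bar x_j - 2\tilde y'\Sigma C_2\bar x_j + (\bar x_j - \eta)'L(\bar x_j - \eta) + (\hat x_j - \bar x_j)'Y_j^{-1}(\hat x_j - \bar x_j) - (\hat x_j - \eta)'L(\hat x_j - \eta). \end{aligned} \]
Therefore we arrive at
\[ \begin{aligned} P(t_{j+1}, x(t_{j+1})) &= -\gamma^2\big( x(t_{j+1}) - \bar x_j(t_{j+1}) \big)'Y_j^{-1}(t_{j+1})\big( x(t_{j+1}) - \bar x_j(t_{j+1}) \big) \\ &\qquad + \Big[ -\gamma^2\tilde y'\tilde y - \gamma^2\bar x_j'C_2'\Sigma C_2\bar x_j + 2\gamma^2\tilde y'\Sigma C_2\bar x_j - \gamma^2(\bar x_j - \eta)'L(\bar x_j - \eta) \Big](t_j) \\ &\qquad + \int_{t_j}^{t_{j+1}} \big( u(t)'u(t) + \bar x_j'C_1'C_1\bar x_j \big)\,dt + c[t_j]. \end{aligned} \]
Hence $P(t_{j+1}, x(t_{j+1}))$ has the form of equation (4.27), with
\[ L(t_{j+1}) = Y_j^{-1}(t_{j+1}), \qquad \eta(t_{j+1}) = \bar x_j(t_{j+1}) \]
giving the recursive formulae
\[ Y_j(t_j) = \big( Y_{j-1}^{-1}(t_j) + C_2(t_j)'\Sigma[t_j]C_2(t_j) \big)^{-1}, \qquad Y_{-1}(t_0) = Q_0^{-1} \tag{4.33} \]
\[ \dot Y_j = AY_j + Y_jA' + \gamma^{-2}Y_jC_1'C_1Y_j + B_1B_1' \tag{4.34} \]
and
\[ \bar x_j(t_j) = \bar x_{j-1}(t_j) + Y_j(t_j)C_2(t_j)'\Sigma[t_j]\big( \tilde y[t_j] - C_2\bar x_{j-1}(t_j) \big), \qquad \bar x_{-1}(t_0) = 0 \]
\[ \dot{\bar x}_j = A\bar x_j + \gamma^{-2}Y_jC_1'C_1\bar x_j + B_2u. \]
Note that $c[t_{j+1}]$ depends only on $y_{[t_0,t_j]}$ and $u_{[t_0,t_j]}$, as expected.

Proposition 4.9. As defined above, for each $j = 0, \dots, f$, the matrix $Y_j(t)$ is strictly positive definite for all $t \in [t_j, t_{j+1}]$.

Proof. This is easily demonstrated by induction. Since $Q_0 > 0$, $Y_0(t_0) > 0$. Proposition 3.19 implies that, if $Y_j(t_j) > 0$, then $Y_j(t) > 0$ for all $t \in [t_j, t_{j+1}]$. Further, if $Y_j(t_{j+1}) > 0$, then equation (4.33) implies $Y_{j+1}(t_{j+1}) > 0$ also.

4.5.3 Recoupling

For the multi-rate sampled-data problem, we can define the partial information value function as
\[ W\big( t_j, P(t_j, \cdot) \big) := \inf_{\mu\in U_{mp}[t_j,t_{f-1}]} \; \sup_{\tilde y\in\ell_2[t_j,t_{f-1}]} \; \sup_{x(t_f)\in\mathbb{R}^n} \Big\{ P(t_f, x(t_f)) + x(t_f)'Q_fx(t_f) \Big\}. \]
Note that the arguments of $W$ are of different types; the first is $t_j$, a real number, and the second is $P(t_j, \cdot)$, which is a function. Thus, with respect to its second argument, $W$ is a functional. Also, here $U_{mp}$ is the subset of $U_p$ for which $R^\perp[t_i]\tilde u[t_i] = R^\perp[t_i]\tilde u[t_{i-1}]$, that is, the set of information state feedback control laws satisfying the multi-rate hold constraints. Then the following is the sampled-data analogue of Corollary 3.29.

Lemma 4.10. The following equality holds:
\[ \sup_{w\in L_2[t_0,t_f],\; v\in\ell_2[t_0,t_f]} \; \sup_{x_0\in\mathbb{R}^n} J(x_0, \mu, w, v) = \sup_{\tilde y\in\ell_2[t_0,t_{f-1}]} \; \sup_{x(t_f)\in\mathbb{R}^n} \Big\{ P(t_f, x(t_f)) + x(t_f)'Q_fx(t_f) \Big\}. \]

Proof. The proof follows exactly the lines of that of Corollary 3.29, so is omitted.

It is straightforward to verify that the results from Lemma 3.28 to Theorem 3.31 in our discussion of the discrete time problem hold with the minor changes necessary for the multi-rate sampled-data problem, since the proofs of that succession simply involve successive maximizations and minimizations, and there are no order changes.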
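The alternation between the measurement update (4.33) at each sample instant and the Riccati flow (4.34) between samples can be sketched numerically. The following is a minimal scalar illustration only: every system quantity is a plain number, the flow is integrated with a fixed-step Runge-Kutta scheme, and the function names and parameter values are hypothetical choices made for this example rather than anything taken from the thesis.

```python
def riccati_flow(y, a, b1, c1, gamma, duration, steps=200):
    """RK4 integration of the scalar form of (4.34) over one inter-sample interval:
    dY/dt = 2 a Y + gamma^-2 c1^2 Y^2 + b1^2."""
    def rhs(val):
        return 2.0 * a * val + (c1 * val / gamma) ** 2 + b1 ** 2

    dt = duration / steps
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs(y + 0.5 * dt * k1)
        k3 = rhs(y + 0.5 * dt * k2)
        k4 = rhs(y + dt * k3)
        y += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y


def measurement_jump(y_prev, c2, sigma):
    """Scalar form of (4.33): Y_j(t_j) = (Y_{j-1}(t_j)^-1 + c2^2 sigma)^-1,
    where sigma is 1 if a sample is available at t_j and 0 otherwise."""
    return 1.0 / (1.0 / y_prev + sigma * c2 * c2)


def filter_riccati(q0, gaps, sigmas, a, b1, c1, c2, gamma):
    """Return (Y just after the jump at t_j, Y at t_{j+1}) for each interval."""
    y = 1.0 / q0  # Y_{-1}(t_0) = Q_0^{-1}
    trace = []
    for gap, sigma in zip(gaps, sigmas):
        y_after_jump = measurement_jump(y, c2, sigma)
        y = riccati_flow(y_after_jump, a, b1, c1, gamma, gap)
        trace.append((y_after_jump, y))
    return trace


# Aperiodic sampling: three intervals, with no measurement at the second instant.
trace = filter_riccati(q0=1.0, gaps=[0.1, 0.3, 0.2], sigmas=[1, 0, 1],
                       a=-1.0, b1=1.0, c1=1.0, c2=1.0, gamma=2.0)
```

In this sketch an available measurement strictly reduces $Y$, a missing one leaves it unchanged, and $Y$ stays positive throughout, in line with Proposition 4.9. For parameter values where the quadratic term dominates, the flow can escape in finite time, which corresponds to the unbounded case treated in Theorem 4.15.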
The function $W$ satisfies the dynamic programming recursion
\[ W\big( t_j, P(t_j, \cdot) \big) = \inf_{R[t_j]\tilde u[t_j]} \; \sup_{\tilde y[t_j]} W\big( t_{j+1}, P(t_{j+1}, \cdot) \big) \]
with boundary condition
\[ W\big( t_f, P(t_f, \cdot) \big) = \sup_{x(t_f)} \Big\{ P(t_f, x(t_f)) + x(t_f)'Q_fx(t_f) \Big\}. \]

Certainty equivalence. In order to apply the certainty equivalence principle to this problem, we would now like to verify the saddle point condition (3.41) of Theorem 3.31. Let $J_1$ be defined by
\[ J_1(t_j) := P(t_j, x(t_j)) + \int_{t_j}^{t_{j+1}} \big( z(t)'z(t) - \gamma^2w(t)'w(t) \big)\,dt + V\big( t_{j+1}, x(t_{j+1}^-), R^\perp[t_{j+1}]\tilde u[t_{j+1}] \big). \tag{4.35} \]
In fact, $J_1(t_j)$ can be written with arguments as $J_1(t_j, x(t_j), u_{[t_0,t_j]}, y_{[t_0,t_j]}, w_{[t_j,t_{j+1}]}, v[t_j])$, but we omit these for notational convenience. Condition (3.41) can be written
\[ \inf_{R[t_j]\tilde u[t_j]\in\mathbb{R}^{m_2}} \; \sup_{x(t_j)\in\mathbb{R}^n} \; \sup_{v[t_j]} \; \sup_{w\in L_2[t_j,t_{j+1}]} J_1(t_j) = \sup_{x(t_j)\in\mathbb{R}^n} \; \inf_{R[t_j]\tilde u[t_j]\in\mathbb{R}^{m_2}} \; \sup_{v[t_j]} \; \sup_{w\in L_2[t_j,t_{j+1}]} J_1(t_j). \]
We can now use equation (4.20) to substitute into this equation the expression for $\sup_{w\in L_2[t_j,t_{j+1}]} J_1(t_j)$. However, equation (4.20) is for the system with jumps; therefore the state in that equation is in fact given by $\begin{bmatrix} x(t_i)' & \tilde u[t_i]' \end{bmatrix}'$ in terms of the state of the sampled-data system here, as can be seen from equation (4.9). This implies that
\[ \begin{aligned} \sup_{v[t_j]} \; \sup_{w\in L_2[t_j,t_{j+1}]} J_1(t_j) &= P(t_j, x(t_j)) + \begin{bmatrix} x(t_j) \\ \tilde u[t_j] \end{bmatrix}' Z_j(t_j) \begin{bmatrix} x(t_j) \\ \tilde u[t_j] \end{bmatrix} \\ &= -\gamma^2\big( x(t_j) - \bar x_{j-1}(t_j) \big)'Y_{j-1}^{-1}(t_j)\big( x(t_j) - \bar x_{j-1}(t_j) \big) + c[t_j] + \begin{bmatrix} x(t_j) \\ \tilde u[t_j] \end{bmatrix}' Z_j(t_j) \begin{bmatrix} x(t_j) \\ \tilde u[t_j] \end{bmatrix} \end{aligned} \]
since this is an unconstrained optimization. This is consistent with the one step delayed information pattern, and corresponds to the fact that during that one step the disturbance can attempt to maximize the cost function, with the controller having to wait until the next time $t_{j+1}$ to act. Applying Lemma 3.32, we see that a sufficient condition for the saddle point condition between $u(t_i)$ and $x(t_i)$ is that
\[ -\gamma^2Y_{i-1}^{-1}(t_i) + Z_{i11}(t_i) < 0, \qquad R[t_i]Z_{i22}(t_i)R[t_i]' > 0. \]
The latter of these conditions follows from Theorem 4.7, using the fact that $R[t_i]$ has full row rank. The former condition is the coupling condition.
Lemma 4.11. If $-\gamma^2Y_{i-1}^{-1}(t_i) + Z_{i11}(t_i) < 0$, then $-\gamma^2Y_{i-1}^{-1}(t_i) + X_{11}[t_i] < 0$.

Proof. From equation (4.23),
\[ X_{11}[t_i] = \big( Z_{i11} - Z_{i12}R'(RZ_{i22}R')^{-1}RZ_{i12}' \big)(t_i) \]
where we have used the fact that $Z_i$ is symmetric. Since $Z_{i22} > 0$, the subtracted term is non-negative definite, so $X_{11}[t_i] \le Z_{i11}(t_i)$ and the result follows.

Remark 4.12. Note that the above coupling condition is analogous to that in the one step delayed discrete time problem. The matrix $Z_{i11}$ is the quadratic coefficient of $x(t_i)$ in the value function before the minimization with respect to $u(t_i)$ takes place for that iteration of the Isaacs recursion. This might be expected, since during the `one step' $w$ has the opportunity to maximize the cost function with $u$ only having open loop control. Similar conditions occur also in the $H_\infty$ problem with delayed measurements; see for example [54].

Theorem 4.13. Suppose, for each $i = 0, \dots, f$, there exists a bounded solution to each of the Riccati differential equations (4.34) on the interval $[t_i, t_{i+1}]$, with $Y_{-1}(t_0) = Q_0^{-1}$. Suppose also that the conditions of Theorem 4.7 are satisfied, and that for each $i = 0, \dots, f$ the coupling condition
\[ -\gamma^2Y_{i-1}^{-1}(t_i) + Z_{i11}(t_i) < 0 \]
holds. Let $\check x$ be defined by
\[ \check x(t_i) = \big( I - \gamma^{-2}Y_{i-1}(t_i)X_{11}[t_i] \big)^{-1}\big( \bar x_{i-1}(t_i) + \gamma^{-2}Y_{i-1}(t_i)X_{12}[t_i]R^\perp[t_i]\tilde u[t_i] \big) \]
and the controller be given by
\[ R[t_i]\tilde u[t_i] = -\big( RZ_{i22}R' \big)^{-1}R[t_i]\big( Z_{i12}'\check x(t_i) + Z_{i22}R^{\perp\prime}R^\perp\tilde u[t_i] \big). \]
Then this controller is optimal for the differential game problem (4.25). Further, this controller gives the closed loop induced norm bound $\|T\| \le \gamma$.

Proof. Applying Theorem 3.31, let
\[ \check x(t_i) = \arg\max_{x(t_i)\in\mathbb{R}^n} \Big\{ P(t_i, x(t_i)) + V\big( t_i, x(t_i^-), R^\perp[t_i]\tilde u[t_i] \big) \Big\}. \]
Then, since the multi-rate version of condition (3.41) holds, we can use $\check x$ as if it were the actual state in the state feedback control law, and the resulting output feedback controller will be optimal. We have
\[ \check x(t_i) = \arg\max_x \Big\{ -\gamma^2( x - \bar x_{i-1} )'Y_{i-1}^{-1}( x - \bar x_{i-1} ) + c[t_i] + \begin{bmatrix} x \\ R^\perp\tilde u \end{bmatrix}' X \begin{bmatrix} x \\ R^\perp\tilde u \end{bmatrix} \Big\} \tag{4.36} \]
where all quantities on the right hand side are evaluated at $t_i$.
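Using Lemma 4.11, the expression being maximized is strictly concave in $x$, so we can differentiate it to give
\[ -\gamma^2Y_{i-1}^{-1}( \check x - \bar x_{i-1} ) + X_{11}\check x + X_{12}R^\perp\tilde u = 0 \]
hence
\[ \check x = ( I - \gamma^{-2}Y_{i-1}X_{11} )^{-1}( \bar x_{i-1} + \gamma^{-2}Y_{i-1}X_{12}R^\perp\tilde u ). \]
This then gives the optimal controller. Corollary 3.30 then gives the optimal value. Since $\bar x_{-1}(t_0) = 0$, $R^\perp[t_0] = 0$, and $Y_{-1}^{-1}(t_0) = Q_0$, we have, with the optimal control in place,
\[ \begin{aligned} \sup_{w\in L_2[t_0,t_f]} \; \sup_{v\in\ell_2[0,f]} \; \sup_{x_0\in\mathbb{R}^n} J(x_0, \mu, w, v) &= W\big( t_0, P(t_0, \cdot) \big) \\ &= \sup_{x_0\in\mathbb{R}^n} \Big\{ P(t_0, x(t_0)) + V\big( t_0, x(t_0^-), R^\perp[t_0]\tilde u[t_0] \big) \Big\} \\ &= \sup_{x_0\in\mathbb{R}^n} \Big\{ -\gamma^2x_0'Q_0x_0 + x_0'X_{11}[t_0]x_0 \Big\} \\ &= 0. \end{aligned} \]

4.5.4 Necessity

Theorem 4.14. Suppose there exists $j$, $0 \le j \le f$, such that
\[ -\gamma^2Y_{j-1}^{-1}(t_j) + Z_{j11}(t_j) > 0. \]
Then
\[ \sup_{w\in L_2[t_0,t_f]} \; \sup_{v\in\ell_2[t_0,t_f]} \; \sup_{x_0\in\mathbb{R}^n} J(x_0, \mu, w, v) \]
is infinite for all controllers $\mu$.

Proof. With $J_1$ defined as in equation (4.35), we have
\[ \begin{aligned} W\big( t_j, P(t_j, \cdot) \big) &= \inf_{R[t_j]\tilde u[t_j]} \; \sup_{\tilde y[t_j]} W\big( t_{j+1}, P(t_{j+1}, \cdot) \big) \\ &= \inf_{R[t_j]\tilde u[t_j]\in\mathbb{R}^{m_2}} \; \sup_{x(t_j)\in\mathbb{R}^n} \; \sup_{v[t_j]} \; \sup_{w\in L_2[t_j,t_{j+1}]} J_1(t_j) \\ &= \inf_{R[t_j]\tilde u[t_j]\in\mathbb{R}^{m_2}} \; \sup_{x(t_j)\in\mathbb{R}^n} \Big\{ -\gamma^2\big( x(t_j) - \bar x_{j-1}(t_j) \big)'Y_{j-1}^{-1}(t_j)\big( x(t_j) - \bar x_{j-1}(t_j) \big) + c[t_j] + \begin{bmatrix} x(t_j) \\ \tilde u[t_j] \end{bmatrix}' Z_j(t_j) \begin{bmatrix} x(t_j) \\ \tilde u[t_j] \end{bmatrix} \Big\}. \end{aligned} \]
Hence, if $-\gamma^2Y_{j-1}^{-1}(t_j) + Z_{j11}(t_j) > 0$ then $W(t_j, P(t_j, \cdot))$ will be unbounded at time $t_j$. Therefore, the upper value of the game will be unbounded for all choices of $\tilde u[t_j]$.

Theorem 4.15. Suppose, for some $j$, $0 \le j \le f$, the solution of the Riccati equation (4.34) is unbounded on $[t_j, t_{j+1}]$. Then for any controller $\mu$,
\[ \sup_{w\in L_2[t_0,t_f]} \; \sup_{v\in\ell_2[t_0,t_f]} \; \sup_{x_0\in\mathbb{R}^n} J(x_0, \mu, w, v) \]
is infinite.

Proof. If for some $i$ the solution of the Riccati differential equation (4.34) is unbounded on $[t_i, t_{i+1}]$, then for all $y$ the information state $P(t_i, x(t_i))$ will be unbounded above also. Then, since the upper value satisfies the forward dynamic programming recursion (3.36), we know that $P(t_f, x(t_f))$ will be unbounded also, and the result follows.

It is interesting to consider the Riccati equations as continuous equations with jumps.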
At each time $t_j$,
\[ Y_j(t_j) = \big( Y_{j-1}^{-1}(t_j) + C_2(t_j)'\Sigma[t_j]C_2(t_j) \big)^{-1} \tag{4.37} \]
so, writing $Q(t) = Y_j^{-1}(t)$ for $t \in (t_j, t_{j+1})$, then
\[ \dot Q = -QA - A'Q - QB_1B_1'Q - \gamma^{-2}C_1'C_1 + \delta_Q\,C_2'\Sigma C_2 \]
where
\[ \delta_Q(t) = \sum_j \delta(t - t_j) \]
and $\delta$ is the Dirac delta function. Here we can see that the Riccati equation has jumps corresponding to the gain of information at each $t_j$. Also, the $\bar x$ equation becomes
\[ \dot{\bar x} = A\bar x + \gamma^{-2}Y_jC_1'C_1\bar x + B_2u + \delta_Q\,Y_j(t_j)C_2(t_j)'\Sigma[t_j]\big( \tilde y[t_j] - C_2\bar x(t_j) \big) \tag{4.38} \]
and similarly has jumps at each $t_j$. In this sense the sampling operation causes discrete changes in the information state at each time $t_j$, and these changes do in fact correspond to a discretization of the variation in the information state caused by continuously available measurements.

4.6 Summary

In this chapter we have given necessary and sufficient conditions for the existence of a $\gamma$-feasible controller for the finite horizon induced 2-norm problem for multi-rate sampled-data systems. We have also given a construction technique for such controllers. For the more general problem of a system with jumps, we have also given necessary and sufficient conditions for existence of a $\gamma$-feasible state feedback controller, and an expression for one such controller.

Using the techniques of game theory and dynamic programming, we have shown that the controller which minimizes the upper value of the game does not need to know future information about the times at which the sampled-data information becomes available. But it does require complete future information about the times at which the hold operator becomes active. This appears to be a significant obstacle to extending this formulation to the infinite time problem. In some sense, given the dynamic formulation we are using, and in comparison with the corresponding results in the continuous time problem, we expect this kind of result.
In the continuous time case, the controller Riccati differential equation for $X_\infty$ is solved backwards in time from an endpoint condition, and the filtering Riccati differential equation for $Y_\infty$ is integrated forwards in time from an initial condition.

But this also would indicate that the special nature of the minimization we are performing has imposed the rather strict requirement of full future knowledge of the hold times. Indeed, we know that, in the continuous time case, the controller which minimizes the upper value of the differential game is far from the only $\gamma$-suboptimal $H_\infty$ controller. Indeed, the ``central'' controller of [22] is not this controller. One viewpoint on this nonuniqueness is that, for the induced 2-norm problem, we do not really need to minimize the upper value of the differential game. All we need to do is ensure boundedness of the upper value. This corresponds to the idea that we do not need to ensure that $X_\infty(t_0)$ in the continuous time case is small; we simply need to ensure that it is finite. In the sequel, this idea will lead us to Riccati differential inequalities, which correspond to the LMI formulation of $H_\infty$ control in the infinite horizon case. This suggests a future approach to the multi-rate sampled-data problem, which would enable us to tackle infinite horizon problems, essentially by combining the results in this chapter with those in the subsequent chapters. We will not pursue this further here, but suggest it as an opportunity for future research.

5. RICCATI DIFFERENTIAL INEQUALITIES

The solution to the $H_\infty$ problem for the time invariant infinite horizon case by Glover and Doyle [28] was characterised in terms of two algebraic Riccati equations. The subsequent solution of the time varying problem was similarly characterised by Khargonekar, Ravi and Nagpal [43] in terms of two Riccati differential equations. However, in the time varying case these solutions are not so straightforward to apply.
The control Riccati differential equation for $X_\infty$ must be integrated over all future time, and in the nonsingular problem it is required that $A - (B_2B_2' - \gamma^{-2}B_1B_1')X_\infty$ is exponentially stable. For a general time varying system it is difficult both to construct $X_\infty$ and to test whether this condition holds.

The solution to the time varying problem has also been expressed in terms of Riccati differential inequalities, or linear matrix inequalities with a differential term, in the parameter varying case by Wu, Yang, Packard and Becker [79]. However, in this solution the resulting controller depends explicitly on $\dot X_\infty$ and hence also on the rate of change of the parameter. This solution is derived by algebraic manipulation of the bounded real lemma applied to the closed loop system, and it is difficult to see how the rate dependence might be removed.

In this chapter, we derive a solution to the time varying $H_\infty$ suboptimal synthesis problem in terms of two Riccati differential inequalities also. However, we do not require the above a priori condition that $A - (B_2B_2' - \gamma^{-2}B_1B_1')X_\infty$ be exponentially stable, and the controller formulae do not depend explicitly on $\dot X_\infty$. We make use of the separation theory of Doyle, Glover, Khargonekar and Francis [22], also known as DGKF, and we also separately derive a solution in terms of inequalities using the game theory of Chapter 3. We will apply the results of this chapter in Chapter 6 to the moving horizon problem, whereby we will further remove the restriction that the solution for $X_\infty$ be integrated over all future time.

5.1 Preliminaries

We define a finite dimensional time varying linear system $P$ by its state space realization
\[ \dot x(t) = A(t)x(t) + B(t)u(t), \qquad x(0) = x_0 \]
\[ y(t) = C(t)x(t) + D(t)u(t). \]
The system is defined for $t \in \mathbb{R}_+$, and, at time $t$, the state $x(t) \in \mathbb{R}^n$, and the input and output signals have dimensions $u(t) \in \mathbb{R}^m$ and $y(t) \in \mathbb{R}^p$ respectively. The system matrices $A$, $B$, $C$ and $D$ are bounded functions of time.
We will write $\Phi(t_1, t_2)$ for the transition matrix of $A$. The following lemmas are proved in [58].

Lemma 5.1. Suppose the pair $(A, C)$ is detectable, and the matrix function $P$ satisfies, for all $t \ge 0$,
a) $\dot P(t) + A(t)'P(t) + P(t)A(t) \le -C(t)'C(t)$
b) There exists $\beta > 0$ such that $P(t) \le \beta I$
c) $P(t) \ge 0$
then the system $\dot x(t) = A(t)x(t)$ is exponentially stable.

Lemma 5.2. Suppose the pair $(A, B)$ is stabilizable, and the matrix function $Q$ satisfies, for all $t \ge 0$,
a) $\dot Q(t) - A(t)Q(t) - Q(t)A(t)' \ge B(t)B(t)'$
b) There exists $\beta > 0$ such that $Q(t) \le \beta I$
c) $Q(t) \ge 0$
then the system $\dot x(t) = A(t)x(t)$ is exponentially stable.

5.2 Inequalities via DGKF separation

In this section we use the separation theory of DGKF [22] to give a solution to the infinite horizon time varying $H_\infty$ problem. In particular, our solution technique is essentially the same as that of Khargonekar, Ravi and Nagpal [43]. However, we show that the controller can be constructed using any solution to a pair of Riccati differential inequalities, and remove particular stability requirements on their solutions.

5.2.1 The bounded real lemma

Lemma 5.3. Suppose $G : L_2 \to L_2$ is an exponentially stable linear time varying system with $G : u \mapsto y$ defined by
\[ \dot x(t) = A(t)x(t) + B(t)u(t), \qquad x(0) = 0 \]
\[ y(t) = C(t)x(t) \]
where $A \in L_\infty^{n\times n}$, $B \in L_\infty^{n\times m}$ and $C \in L_\infty^{p\times n}$. Then, given $\gamma > 0$,
a) if there exists $X \in L_\infty^{n\times n}$ such that
\[ \dot X + A'X + XA + \gamma^{-2}XBB'X + C'C \le 0 \qquad \forall\, t \in \mathbb{R}_+ \tag{5.1} \]
then $\|G\| \le \gamma$.
b) if, in addition to the conditions of a) above, the matrix function $A + \gamma^{-2}BB'X$ is exponentially stable, then $\|G\| < \gamma$.

Proof. a) A simple completion of squares gives
\[ \frac{d}{dt}\, x(t)'X(t)x(t) \le -y(t)'y(t) + \gamma^2u(t)'u(t) - \gamma^2\big| u(t) - \gamma^{-2}B'(t)X(t)x(t) \big|^2. \]
Integrating this expression over $[0, \infty)$, and using $x(0) = 0$ and exponential stability, we arrive at
\[ \|y\|_2^2 - \gamma^2\|u\|_2^2 \le -\gamma^2\|u - \gamma^{-2}B'Xx\|_2^2. \tag{5.2} \]
Since the induced norm of $G$ is defined by
\[ \|G\|^2 := \sup_{u\in L_2} \frac{\|y\|_2^2}{\|u\|_2^2} \]
this implies that $\|G\| \le \gamma$.
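Part a) can be checked by hand in the simplest possible setting. The sketch below is an illustration under strong assumptions that are not part of the lemma: the system is scalar and time invariant, $\dot x = ax + bu$, $y = cx$ with $a < 0$ and $b \ne 0$, and only constant solutions $X$ of (5.1) are sought, so that the inequality reduces to the quadratic condition $2aX + \gamma^{-2}b^2X^2 + c^2 \le 0$. For this system the induced norm is $|bc|/|a|$, and a constant solution exists exactly when $\gamma$ is at least this value. The function names are hypothetical.

```python
import math


def constant_brl_solution(a, b, c, gamma):
    """Smallest constant X with 2 a X + gamma^-2 b^2 X^2 + c^2 <= 0, or None.

    Scalar, time invariant special case of inequality (5.1); assumes a < 0
    (exponential stability), b != 0 and gamma > 0.
    """
    if a >= 0.0 or b == 0.0 or gamma <= 0.0:
        raise ValueError("requires a < 0, b != 0 and gamma > 0")
    disc = a * a - (b * c / gamma) ** 2
    if disc < 0.0:
        return None  # the quadratic has no real root, so no constant X exists
    return (gamma / b) ** 2 * (-a - math.sqrt(disc))


def riccati_residual(x, a, b, c, gamma):
    """Left hand side of (5.1) for a constant scalar X (the derivative term is zero)."""
    return 2.0 * a * x + (b * x / gamma) ** 2 + c * c


# For a = -2, b = 1, c = 3 the induced norm is |b c| / |a| = 1.5.
a, b, c = -2.0, 1.0, 3.0
x_feasible = constant_brl_solution(a, b, c, gamma=2.0)    # gamma above the norm
x_boundary = constant_brl_solution(a, b, c, gamma=1.5)    # gamma equal to the norm
x_infeasible = constant_brl_solution(a, b, c, gamma=1.0)  # gamma below the norm
```

At $\gamma$ equal to the norm the inequality is still solvable but only with equality, which matches the nonstrict conclusion $\|G\| \le \gamma$ of part a).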
For b), note that $\|G\| < \gamma$ if and only if there exists $\varepsilon > 0$ such that
\[ \|y\|_2^2 - \gamma^2\|u\|_2^2 \le -\varepsilon\|u\|_2^2 \]
for all $u \in L_2$. Let $v = u - \gamma^{-2}B'Xx$; then the map from $v$ to $u$ has realization
\[ \left[ \begin{array}{c|c} A + \gamma^{-2}BB'X & B \\ \hline \gamma^{-2}B'X & I \end{array} \right] \]
hence, if $A + \gamma^{-2}BB'X$ is exponentially stable, this is a bounded linear operator and has a finite induced norm. Therefore there exists $\delta > 0$ such that $\delta\|u\|_2^2 \le \|u - \gamma^{-2}B'Xx\|_2^2$, and equation (5.2) becomes
\[ \|y\|_2^2 - \gamma^2\|u\|_2^2 \le -\gamma^2\delta\|u\|_2^2 \]
giving the desired result.

The converse of this result can be stated as follows.

Lemma 5.4. Suppose $G$ is an exponentially stable linear time varying system defined as in Lemma 5.3. Then if $\|G\| < \gamma$, there exists $X \in L_\infty^{n\times n}$, with $X \ge 0$, such that
\[ \dot X + A'X + XA + \gamma^{-2}XBB'X + C'C \le 0 \qquad \forall\, t \in \mathbb{R}_+. \]

Proof. See Tadmor [65].

Remark 5.5. Note that the above results can be shown without assuming differentiability of $X$, or even finite dimensionality of the system matrices. In this case, the Riccati differential equations are replaced by integral equations. In the sequel, we will write the Riccati equations in the differential form.

Remark 5.6. Even if we find or construct $X$ which satisfies inequality (5.1) strictly, this alone does not guarantee that $\|G\|$ is strictly less than $\gamma$. However, we would like a simpler condition than stability of $A + \gamma^{-2}BB'X$, since this appears to be an untestable condition for arbitrary $X$.

Remark 5.7. The bounded real lemma as stated above makes the important assumption that the system is exponentially stable. Often we would like to use this in a synthesis framework, with a view to constructing an $X$ satisfying (5.1). In order to do this, we would like to be able to use this as a test for a system whose stability is unknown. The next result gives a condition for this purpose.

Lemma 5.8. Suppose $G : L_{2e} \to L_{2e}$ is a linear time varying system, not a priori known to be stable, with $G : u \mapsto y$ defined by
\[ \dot x(t) = A(t)x(t) + B(t)u(t), \qquad x(0) = 0 \]
\[ y(t) = C(t)x(t) \]
where $A \in L_\infty^{n\times n}$, $B \in L_\infty^{n\times m}$ and $C \in L_\infty^{p\times n}$.
Then, given $\gamma > 0$, if there exists a positive definite $X \in L_\infty^{n\times n}$ and $\varepsilon > 0$ such that
\[ \dot X + A'X + XA + \gamma^{-2}XBB'X + C'C \le -\varepsilon I \qquad \forall\, t \in \mathbb{R}_+ \]
then $G$ is exponentially stable, and $\|G\| < \gamma$.

Proof. As in Lemma 5.3, we easily arrive at
\[ \|y\|_2^2 - \gamma^2\|u\|_2^2 \le -\gamma^2\|u - \gamma^{-2}B'Xx\|_2^2 - \varepsilon\|x\|_2^2. \]
Since $C$ is a bounded matrix function, there exists $\beta > 0$ such that $\|y\|_2^2 \le \beta\|x\|_2^2$, which implies
\[ \Big( 1 + \frac{\varepsilon}{\beta} \Big)\|y\|_2^2 \le \gamma^2\|u\|_2^2 \]
which therefore implies
\[ \|G\|^2 \le \gamma^2\Big( 1 + \frac{\varepsilon}{\beta} \Big)^{-1} < \gamma^2. \]
The system $G$ is exponentially stable since $X$ satisfies
\[ \dot X + A'X + XA \le -\varepsilon I \qquad \forall\, t \in \mathbb{R}_+ \]
and since $(A, \varepsilon^{1/2}I)$ is detectable, stability follows from Lemma 5.1.

The case when $G$ has a direct feedthrough term is not significantly more difficult to derive, but results in more complicated expressions, as follows.

Lemma 5.9. Suppose $G : L_{2e} \to L_{2e}$ is a linear time varying system with $G : u \mapsto y$ defined by
\[ \dot x(t) = A(t)x(t) + B(t)u(t), \qquad x(0) = 0 \]
\[ y(t) = C(t)x(t) + D(t)u(t). \]
Suppose also $\|D\|_\infty < \gamma$. Then
a) if $G$ is exponentially stable, then if there exists $X \in L_\infty^{n\times n}$ satisfying
\[ \dot X + A'X + XA + \gamma^{-1}C'C + \gamma(XB + \gamma^{-1}C'D)(\gamma^2I - D'D)^{-1}(B'X + \gamma^{-1}D'C) \le 0 \tag{5.3} \]
then $\|G\| \le \gamma$.
b) If, in addition to the conditions for a), the time-varying matrix function
\[ A + B(\gamma^2I - D'D)^{-1}(D'C + \gamma B'X) \]
is exponentially stable, then $\|G\| < \gamma$.
c) If $G$ is not known a priori to be stable, then if $(A, C)$ is detectable and there exists $X \in L_\infty^{n\times n}$ satisfying equation (5.3) with $X \ge 0$, then $\|G\| \le \gamma$ and $G$ is exponentially stable. Note that it is easy to construct a similar result for $(A, B)$ stabilizable by consideration of $\|G^\sim\|$.
d) If $G$ is not known a priori to be stable, then if the inequality (5.3) is uniformly satisfied by $X \in L_\infty^{n\times n}$ with $X \ge 0$, then $G$ is exponentially stable and $\|G\| < \gamma$. By uniformly here we mean that the left hand side is less than $-\varepsilon I$ for some constant $\varepsilon > 0$.

Proof. The proof follows exactly that of Lemmas 5.3 and 5.8, so is omitted.

Remark 5.10. Note that the $X$ used in equation (5.3) is not the same as that used in equation (5.1).
Substituting $D = 0$ into (5.3) gives
\[ \dot X + A'X + XA + \gamma^{-1}XBB'X + \gamma^{-1}C'C \le 0 \]
which is equivalent to
\[ \dot{\hat X} + A'\hat X + \hat XA + \gamma^{-2}\hat XBB'\hat X + C'C \le 0 \]
with $\hat X = \gamma X$.

Remark 5.11. An alternative form for the Riccati differential inequality (5.3) is
\[ \dot X + \big( A + B(\gamma^2I - D'D)^{-1}D'C \big)'X + X\big( A + B(\gamma^2I - D'D)^{-1}D'C \big) + \gamma XB(\gamma^2I - D'D)^{-1}B'X + \gamma^{-1}C'\big( I + D(\gamma^2I - D'D)^{-1}D' \big)C \le 0. \]

Remark 5.12. Another alternative form for the Riccati differential inequality (5.3) is
\[ \dot X + A'X + XA + \begin{bmatrix} B'X \\ C \end{bmatrix}' \begin{bmatrix} \gamma I & -D' \\ -D & \gamma I \end{bmatrix}^{-1} \begin{bmatrix} B'X \\ C \end{bmatrix} \le 0. \tag{5.4} \]

Remark 5.13. Yet another alternative form for the strict version of the Riccati differential inequality (5.3) is
\[ \begin{bmatrix} \dot X + A'X + XA & XB & C' \\ B'X & -\gamma I & D' \\ C & D & -\gamma I \end{bmatrix} < 0. \]
This is the Linear Matrix Inequality formulation. Applying the Schur complement formula allows conversion to the form (5.4), from which another application of the Schur formula results in the form (5.3).

Note that the nonstrict version of this inequality is not equivalent to the nonstrict Riccati form. In fact, it is easy to see [15, p. 28] that
\[ \begin{bmatrix} Q & S \\ S' & R \end{bmatrix} \le 0 \]
if and only if
\[ R \le 0, \qquad Q - SR^\dagger S' \le 0, \qquad S(I - RR^\dagger) = 0 \]
and so in general the nonstrict LMI is a stronger condition than the nonstrict Riccati inequality. In the case where $\|D\|_\infty < \gamma$, the matrix $R$ is invertible and the two forms are equivalent. Note however that we have not said anything about the case when $\|D\|_\infty = \gamma$.

[Figure 5.1: Feedback configuration for Lemma 5.14. Block $P$ maps $(w, v)$ to $(z, r)$, with $Q$ connected in feedback from $r$ to $v$.]

We will need the following lemma.

Lemma 5.14. Let $P$ be a system partitioned as
\[ P = \begin{bmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{bmatrix} \]
and let $Q$ be another system connected to $P$ as in Figure 5.1. Assume that $P$ and $Q$ are exponentially stable. Suppose that $P$ satisfies
\[ \gamma^2\|r\|_2^2 + \|z\|_2^2 \le \gamma^2\|w\|_2^2 + \|v\|_2^2 \qquad \text{for all } w, v \in L_2. \tag{5.5} \]
Then if $\|Q\| < \gamma$, the closed loop system is exponentially stable and $\|T_{zw}\| \le \gamma$. Further, if $P_{21}$ has an exponentially stable inverse, then $\|T_{zw}\| < \gamma$.

Proof. From (5.5) with $w = 0$, the norm $\|P_{22}\| \le \gamma^{-1}$, and hence $\|Q\| < \gamma$ implies that the closed loop is internally stable, since $P$ and $Q$ are both stable.
The nonstrict inequality then follows, since
\[ \|z\|_2^2 - \gamma^2\|w\|_2^2 \le \|v\|_2^2 - \gamma^2\|r\|_2^2 \le 0. \]
For the strict inequality, first note that $\|Q\| < \gamma$ implies there exists $\varepsilon > 0$ such that $\|v\|_2^2 - \gamma^2\|r\|_2^2 \le -\varepsilon\|r\|_2^2$. Then if $P_{21}$ has an exponentially stable inverse, $w = P_{21}^{-1}(I - P_{22}Q)r = T_{wr}r$, with $T_{wr}$ a bounded operator. Therefore $\|w\|_2^2 \le \|T_{wr}\|^2\|r\|_2^2$, giving
\[ \|v\|_2^2 - \gamma^2\|r\|_2^2 \le -\varepsilon\|r\|_2^2 \le -\frac{\varepsilon}{\|T_{wr}\|^2}\|w\|_2^2. \]
Thus
\[ \|z\|_2^2 - \gamma^2\|w\|_2^2 \le -\frac{\varepsilon}{\|T_{wr}\|^2}\|w\|_2^2 \]
and the result follows.

Let $G$ be the finite dimensional linear time varying system with state space realization
\[ \begin{aligned} \dot x(t) &= A(t)x(t) + B_1(t)w(t) + B_2(t)u(t), \qquad x(0) = 0 \\ z(t) &= C_1(t)x(t) + D_{12}(t)u(t) \\ y(t) &= C_2(t)x(t) + D_{21}(t)w(t) \end{aligned} \tag{5.6} \]
defined on $\mathbb{R}_+$, with all matrices bounded. We will make the following assumptions about $G$.
a) $D_{12}(t)'D_{12}(t) = I$ for all $t \in \mathbb{R}_+$
b) $D_{21}(t)D_{21}(t)' = I$ for all $t \in \mathbb{R}_+$
c) $(A, B_2)$ is stabilizable
d) $(A, C_2)$ is detectable
We would now like to derive a set of sufficient conditions, based on Riccati differential inequalities, for existence of a stabilizing controller such that $\|T_{zw}\| < \gamma$. In the proof, we will use the separation theory of [22], and essentially follow [43] for the time varying case.

Theorem 5.15. Let $G$ be a system of the form (5.6) satisfying the above assumptions. Then there exists a controller $K$ which exponentially stabilizes the closed loop system and results in $\|T_{zw}\| < \gamma$ if
a) there exists a bounded non-negative definite solution $X$ to the Riccati differential inequality
\[ -\dot X \ge (A - B_2D_{12}'C_1)'X + X(A - B_2D_{12}'C_1) - X(B_2B_2' - \gamma^{-2}B_1B_1')X + C_1'D_{12}^\perp D_{12}^{\perp\prime}C_1 \tag{5.7} \]
such that $A - B_2D_{12}'C_1 - (B_2B_2' - \gamma^{-2}B_1B_1')X$ is exponentially stable, and
b) there exists a bounded non-negative definite solution $Z$ to the Riccati differential inequality
\[ \dot Z \ge A_zZ + ZA_z' - Z(C_z'C_z - \gamma^{-2}C_k'C_k)Z + B_1D_{21}^{\perp\prime}D_{21}^\perp B_1' \tag{5.8} \]
such that $A_z - Z(C_z'C_z - \gamma^{-2}C_k'C_k)$ is exponentially stable, where
\[ \begin{aligned} A_z &= A - B_1D_{21}'C_2 + \gamma^{-2}B_1D_{21}^{\perp\prime}D_{21}^\perp B_1'X \\ C_k &= -D_{12}'C_1 - B_2'X \\ C_z &= C_2 + \gamma^{-2}D_{21}B_1'X. \end{aligned} \]
If these conditions are satisfied, then an admissible controller $K$ is given by
\[ \dot{\hat x} = A_k\hat x + B_ky, \qquad \hat x(0) = 0, \qquad u = C_k\hat x \tag{5.9} \]
where
\[ \begin{aligned} A_k &= A + \gamma^{-2}B_1B_1'X + B_2C_k - (B_1D_{21}' + ZC_z')C_z \\ B_k &= B_1D_{21}' + ZC_z'. \end{aligned} \]

Proof. We use the by now standard change of variables, first introduced in [22], setting $v = u - C_kx$ and $r = w - \gamma^{-2}B_1'Xx$. Then we can rewrite the closed loop in terms of these variables, resulting in a system $P$ defined by
\[ \begin{aligned} \dot x &= (A + B_2C_k)x + B_1w + B_2v \\ z &= (C_1 + D_{12}C_k)x + D_{12}v \\ r &= -\gamma^{-2}B_1'Xx + w \end{aligned} \]
and a controller $Q$ given by
\[ \begin{aligned} \dot e &= (A_z - ZC_z'C_z)e + (B_1 - B_kD_{21})r \\ v &= -C_ke \end{aligned} \]
where $e = x - \hat x$. We aim to show that these satisfy the assumptions of Lemma 5.14. First, we show that $P$ is exponentially stable. Rewrite (5.7) as
\[ \dot X + (A + B_2C_k)'X + X(A + B_2C_k) \le -XB_2B_2'X - \gamma^{-2}XB_1B_1'X - C_1'D_{12}^\perp D_{12}^{\perp\prime}C_1. \tag{5.10} \]
By assumption, $A + B_2C_k + \gamma^{-2}B_1B_1'X$ is exponentially stable, and so the pair $(A + B_2C_k, \gamma^{-1}B_1'X)$ is detectable. Then it follows from Lemma 5.1 that $P$ is exponentially stable. In order to show the norm bound, algebra gives
\[ \frac{d}{dt}\, x'Xx \le \gamma^2w(t)'w(t) - z(t)'z(t) + v(t)'v(t) - \gamma^2r(t)'r(t). \]
Since $P$ is exponentially stable, and $x(0) = 0$, integrating over $[0, \infty)$ gives
\[ \gamma^2\|r\|_2^2 + \|z\|_2^2 \le \gamma^2\|w\|_2^2 + \|v\|_2^2. \]
Partitioning $P$ in the obvious way, straightforward manipulations give
\[ P_{21}^{-1} = \left[ \begin{array}{c|c} A - B_2D_{12}'C_1 - (B_2B_2' - \gamma^{-2}B_1B_1')X & B_1 \\ \hline \gamma^{-2}B_1'X & I \end{array} \right] \]
which is exponentially stable by assumption.

Next we establish stability of $Q$. Rewrite (5.8) as
\[ \dot Z - (A_z - ZC_z'C_z)Z - Z(A_z - ZC_z'C_z)' \ge ZC_z'C_zZ + \gamma^{-2}ZC_k'C_kZ + B_1D_{21}^{\perp\prime}D_{21}^\perp B_1'. \]
Since by assumption $A_z - ZC_z'C_z + \gamma^{-2}ZC_k'C_k$ is exponentially stable, the pair $(A_z - ZC_z'C_z, \gamma^{-1}ZC_k')$ is stabilizable, and thus by Lemma 5.2, $A_z - ZC_z'C_z$ is exponentially stable. In order to show that $\|Q\| < \gamma$, we establish that $\|Q^\sim\| < \gamma$, where $Q^\sim$ is the dual of $Q$, defined on $(-\infty, 0]$.
The dual system $\tilde Q$ has realization
\[ \begin{aligned} \dot{\tilde e} &= (A_z' - C_z'C_zZ)\tilde e - C_k'\tilde r \\ \tilde v &= (B_1' - D_{21}'B_k')\tilde e \end{aligned} \]
where the time varying system matrices are now functions of $-t$, and where $Z$ satisfies the time reversed Riccati inequality
\[ \dot Z \le -(A_z - ZC_z'C_z)Z - Z(A_z - ZC_z'C_z)' - ZC_z'C_zZ - \gamma^{-2}ZC_k'C_kZ - B_1D_{21}^{\perp\prime}D_{21}^{\perp}B_1'. \]
Then algebra gives
\[ \frac{d}{dt}\,\tilde e'Z\tilde e \le \gamma^2\tilde r'\tilde r - \gamma^2|\tilde r + \gamma^{-2}C_kZ\tilde e|^2 - \tilde v'\tilde v \]
and, since $A_z - ZC_z'C_z$ is exponentially stable, integrating over $(-\infty, 0]$ gives
\[ \|\tilde v\|_2^2 - \gamma^2\|\tilde r\|_2^2 \le -\gamma^2\|\tilde r + \gamma^{-2}C_kZ\tilde e\|_2^2 - \tilde e(0)'Z(0)\tilde e(0). \]
Let $L$ be the system mapping $\tilde r$ to $\tilde r + \gamma^{-2}C_kZ\tilde e$. Then
\[ L = \left[\begin{array}{c|c} A_z' - C_z'C_zZ & -C_k' \\ \hline \gamma^{-2}C_kZ & I \end{array}\right] \]
which has inverse
\[ L^{-1} = \left[\begin{array}{c|c} A_z' - C_z'C_zZ + \gamma^{-2}C_k'C_kZ & -C_k' \\ \hline -\gamma^{-2}C_kZ & I \end{array}\right]. \]
By assumption, $A_z - Z(C_z'C_z - \gamma^{-2}C_k'C_k)$ is exponentially stable, and therefore $L^{-1}$ is a bounded linear operator. So
\[ \|\tilde v\|_2^2 - \gamma^2\|\tilde r\|_2^2 \le -\frac{\gamma^2}{\|L^{-1}\|^2}\|\tilde r\|_2^2 - \tilde e(0)'Z(0)\tilde e(0) \]
for all $\tilde r \in L_2(-\infty, 0]$, and since $Z(0) \ge 0$, we have $\|Q\| = \|\tilde Q\| < \gamma$. Hence we have shown that, with this controller, the constructed systems $P$ and $Q$ satisfy the assumptions of Lemma 5.14, and so the closed loop norm satisfies $\|T_{zw}\| < \gamma$ and the system is internally stable.

Remark 5.16. Note that the conditions of the above theorem are in fact necessary also. To show necessity, we need only show that existence of solutions to the Riccati differential equations is necessary, since these will also satisfy the inequalities. This is proved in the case when $B_1D_{21}' = 0$ and $C_1'D_{12} = 0$ by Ravi, Nagpal, and Khargonekar in [43], and generalization to the case with cross terms is not difficult.

We would now like to remove the assumptions in Theorem 5.15 that
\[ A - B_2D_{12}'C_1 - (B_2B_2' - \gamma^{-2}B_1B_1')X \quad\text{and}\quad A_z - Z(C_z'C_z - \gamma^{-2}C_k'C_k) \]
are exponentially stable. In order to do this, as in the bounded real lemma, we will solve uniformly negative definite Riccati differential inequalities.

Theorem 5.17. Suppose in addition to the above assumptions on $G$ that the pair $(A - B_2D_{12}'C_1, D_{12}^{\perp\prime}C_1)$ is detectable.
Then if
a) there exists a bounded non-negative definite solution $X$ to the Riccati differential inequality
\[ \dot X + (A - B_2D_{12}'C_1)'X + X(A - B_2D_{12}'C_1) - X(B_2B_2' - \gamma^{-2}B_1B_1')X + C_1'D_{12}^{\perp}D_{12}^{\perp\prime}C_1 \le 0 \tag{5.11} \]
and
b) there exists a constant $\varepsilon > 0$ and a bounded non-negative definite solution $Z$ to the Riccati differential inequality
\[ -\dot Z + A_zZ + ZA_z' - Z(C_z'C_z - \gamma^{-2}C_k'C_k)Z + B_1D_{21}^{\perp\prime}D_{21}^{\perp}B_1' \le -\varepsilon I \tag{5.12} \]
then the controller $K$ defined by equation (5.9) is stabilizing, and results in a closed loop induced norm which satisfies $\|T_{zw}\| \le \gamma$.

Proof. The proof follows the same outline as that of Theorem 5.15, with the following changes in detail. First, note that to show stability of $P$, since $(A - B_2D_{12}'C_1, D_{12}^{\perp\prime}C_1)$ is detectable, so is
\[ \left(A - B_2D_{12}'C_1 - B_2B_2'X,\ \begin{bmatrix} C_1'D_{12}^{\perp} & XB_2 \end{bmatrix}'\right), \]
and therefore equation (5.10) implies that $P$ is exponentially stable. Stability of $Q$ is established by observing that $(A_z - ZC_z'C_z, \varepsilon^{1/2}I)$ is stabilizable, and that the Riccati inequality in $Z$ can be written
\[ \dot Z - (A_z - ZC_z'C_z)Z - Z(A_z - ZC_z'C_z)' \ge ZC_z'C_zZ + \gamma^{-2}ZC_k'C_kZ + B_1D_{21}^{\perp\prime}D_{21}^{\perp}B_1' + \varepsilon I. \]
Considering the time reversed version of this inequality as before gives
\[ \|\tilde v\|_2^2 + \varepsilon\|\tilde e\|_2^2 \le \gamma^2\|\tilde r\|_2^2. \]
Since $\|\tilde v\|_2^2 \le \beta\|\tilde e\|_2^2$ for some $\beta > 0$, we arrive at
\[ \left(1 + \frac{\varepsilon}{\beta}\right)\|\tilde v\|_2^2 \le \gamma^2\|\tilde r\|_2^2 \]
which implies $\|Q\| < \gamma$. Then applying Lemma 5.14 gives $\|T_{zw}\| \le \gamma$.

5.3 Connections to differential games

We also present here, for the finite horizon case, some connections to the game theoretic approach of Chapter 3. We will consider the linear time varying system on the finite horizon $[t_0, t_f]$ described by
\[ \begin{aligned} \dot x(t) &= A(t)x(t) + B_1(t)w(t) + B_2(t)u(t), \qquad x(t_0) = x_0 \\ z(t) &= C_1(t)x(t) + D_{12}(t)u(t) \\ y(t) &= C_2(t)x(t) + D_{21}(t)w(t). \end{aligned} \]
We again make the standard assumptions that
\[ D_{21}(t)B_1(t)' = 0, \qquad D_{12}(t)'C_1(t) = 0 \]
for all $t \in [t_0, t_f]$, and use the cost function
\[ J(x_0, \mu, w) = -\gamma^2x_0'Q_0x_0 + \int_{t_0}^{t_f}\left\{z(t)'z(t) - \gamma^2w(t)'w(t)\right\}dt + x(t_f)'Q_fx(t_f) \]
where $Q_f > 0$ is a symmetric positive definite matrix.
In the state feedback case, we will assume the initial state $x_0 = 0$, and in the output feedback case we will require $Q_0 > 0$, and we will consider the initial state $x_0$ to be part of the unknown disturbance, that is, part of the maximizing player's strategy. We would then like to solve the output feedback problem
\[ \inf_{\mu \in \mathcal{U}_{of}}\ \sup_{w \in L_2,\ x_0 \in \mathbb{R}^n} J(x_0, \mu, w). \]
The solution to this problem is well known, and is derived in Chapter 3. A $\gamma$-feasible controller exists satisfying
\[ \sup_{w \in L_2,\ x_0 \in \mathbb{R}^n} J(x_0, \mu, w) \le 0 \]
if and only if there exist bounded solutions to the two Riccati differential equations
\[ \dot Y_\infty = AY_\infty + Y_\infty A' - Y_\infty(C_2'C_2 - \gamma^{-2}C_1'C_1)Y_\infty + B_1B_1', \qquad Y_\infty(t_0) = Q_0^{-1} \]
and
\[ -\dot X_\infty = A'X_\infty + X_\infty A - X_\infty(B_2B_2' - \gamma^{-2}B_1B_1')X_\infty + C_1'C_1, \qquad X_\infty(t_f) = Q_f \]
satisfying the coupling condition $X_\infty(t) - \gamma^2Y_\infty^{-1}(t) < 0$.

5.3.1 State feedback

The state feedback result is easily derived. We construct a controller for the same system with cost function
\[ J(x_0, \mu, w, \Delta_1) = \int_{t_0}^{t_f}\left\{z'z + x'\Delta_1x - \gamma^2w'w\right\}dt + x(t_f)'Q_fx(t_f) \]
where $\Delta_1 \ge 0$ is a given bounded positive semidefinite matrix function of time, and the initial state $x_0 = 0$. Clearly, if we can design a controller such that this cost function is negative for all disturbances $w$, then this same controller will also ensure that the original cost function is negative. Let $X_\Delta$ satisfy
\[ -\dot X_\Delta = A'X_\Delta + X_\Delta A - X_\Delta(B_2B_2' - \gamma^{-2}B_1B_1')X_\Delta + C_1'C_1 + \Delta_1, \qquad X_\Delta(t_f) = Q_1. \tag{5.13} \]
A trivial modification of the completion of squares argument of Chapter 3 gives
\[ J(x(t_0), \mu, w, \Delta_1) = x(t_0)'X_\Delta(t_0)x(t_0) + \|\mu(\cdot, x(\cdot)) - u^*\|_2^2 - \gamma^2\|w - w^*\|_2^2 - x(t_f)'(Q_1 - Q_f)x(t_f) \]
for any state feedback controller $\mu \in \mathcal{U}_{sf}$ and any input $w$, where
\[ u^*(t) = -B_2(t)'X_\Delta(t)x(t) \quad\text{and}\quad w^*(t) = \gamma^{-2}B_1(t)'X_\Delta(t)x(t). \]
Hence, if $Q_1 > Q_f$, since $J(x_0, \mu, w, 0) \le J(x_0, \mu, w, \Delta_1)$,
\[ J(x(t_0), \mu, w, 0) \le x(t_0)'X_\Delta(t_0)x(t_0) + \|\mu(\cdot, x(\cdot)) - u^*\|_2^2 - \gamma^2\|w - w^*\|_2^2. \]
Therefore, when $x(t_0) = 0$, the state feedback strategy $\mu^*(t, x(t)) = -B_2(t)'X_\Delta(t)x(t)$ is a $\gamma$-feasible state feedback controller for the original problem.
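To make the construction above concrete, the following is a minimal sketch for a scalar, time invariant plant, in which the Riccati differential equation (5.13) is integrated backwards from its terminal condition with a hand-written Runge-Kutta scheme. The plant data, the horizon, and the names `riccati_backward`, `x_inf`, `x_delta` and `gain` are illustrative assumptions, not taken from the text.

```python
def riccati_backward(a, r, q, x_final, horizon, steps=10000):
    """Integrate -dX/dt = 2*a*X - r*X**2 + q backwards from X(t_f) = x_final.

    Classical RK4 in reversed time s = t_f - t; returns X(t_f - horizon).
    Scalar case only (an assumption made for this sketch).
    """
    def rhs(x):
        return 2.0 * a * x - r * x * x + q

    h = horizon / steps
    x = x_final
    for _ in range(steps):
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * h * k1)
        k3 = rhs(x + 0.5 * h * k2)
        k4 = rhs(x + h * k3)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x


# Hypothetical scalar plant data (not from the text).
a, b1, b2, c1, gamma = 1.0, 1.0, 2.0, 1.0, 2.0
r = b2**2 - b1**2 / gamma**2        # scalar version of B2 B2' - gamma^{-2} B1 B1'
Qf, Q1, delta1 = 0.5, 0.6, 0.5      # Q1 > Qf and Delta_1 >= 0, chosen arbitrarily

# Unperturbed solution X_infinity(t0) and perturbed solution X_Delta(t0).
x_inf = riccati_backward(a, r, c1**2, Qf, horizon=10.0)
x_delta = riccati_backward(a, r, c1**2 + delta1, Q1, horizon=10.0)

# State feedback gain at t0: u = -B2' X_Delta x.
gain = -b2 * x_delta
```

In this scalar case one would expect `x_delta` to dominate `x_inf`, which is consistent with the observation that a controller designed for the perturbed cost is also feasible for the original one.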
The important thing to note here is that the controller depends on $X_\Delta$ and not on $X_\infty$. Also, since existence of a bounded solution for $X_\Delta$ implies existence of a $\gamma$-feasible state feedback controller, existence of a bounded solution for $X_\Delta$ implies existence of a bounded solution for $X_\infty$. Further, the key point is that $\Delta_1$ is arbitrary. This means that we do not need to know it in order to synthesize a controller. This also allows us to avoid having to solve the Riccati differential equation exactly; instead we may solve
\[ -\dot X_\Delta \ge A'X_\Delta + X_\Delta A - X_\Delta(B_2B_2' - \gamma^{-2}B_1B_1')X_\Delta + C_1'C_1, \qquad X_\Delta(t_f) \ge Q_f. \tag{5.14} \]
In particular, in the case when the Riccati differential equation is monotonic, it is easy to guarantee that the solution we are constructing has a gradient below some prespecified value, whereas previously we could only assume that the constructed solution was feasible provided that the numerical solution to the Riccati differential equation was sufficiently close to the actual solution. Monotonicity is guaranteed for linear time invariant systems for specific choices of the terminal weight $Q_f$, as we shall show in the next chapter.

5.3.2 Output feedback

In the output feedback case, we will assume the cost function contains an initial state term, with $Q_0 > 0$. Then Proposition 3.19 implies that both the RDEs for $X_\infty$ and $Y_\infty$ have strictly positive definite solutions, since by assumption they have strictly positive definite boundary conditions. Then we can write
\[ -\frac{d}{dt}Y_\infty^{-1} = Y_\infty^{-1}A + A'Y_\infty^{-1} - C_2'C_2 + \gamma^{-2}C_1'C_1 + Y_\infty^{-1}B_1B_1'Y_\infty^{-1}. \]
Clearly we could simply add $\Delta_1$ to the $X_\infty$ equation and $\gamma^{-2}\Delta_1$ to the $Y_\infty^{-1}$ equation, since in this case we would be simply increasing the size of $\|z\|_2^2$ by increasing the size of $C_1'C_1$, and if a controller is $\gamma$-feasible for this problem then it must be so for the smaller problem also. But by doing so we gain little. In fact, the only difference would be that we have made it harder to synthesize $\gamma$-feasible controllers, since we would then need to know $\Delta_1$ in order to construct solutions to both RDEs with the same $\Delta_1$. So we would like to add different positive matrices to each of the Riccati equations, and still guarantee the induced norm bound. In this section we will derive conditions under which we can do this.

Our approach will be to again consider the certainty equivalence formulation as in Section 3.5.2, but again we will not rely on the proof of the general certainty equivalence theorem. In fact, since we would simply like to give sufficient conditions for construction of a $\gamma$-feasible controller, we will see that the required result is conceptually very simple. Our approach will be to use different cost functions for the past and future halves of the cost function. Define
\[ V(t, x(t)) := x(t)'X_\Delta(t)x(t). \]
As we have seen, $V(t, x(t))$ is an upper bound for the cost of the game with the minimax state feedback controller in place. Recall the definition of the following disturbance class, which is the set of all possible $(w, x_0)$ disturbances consistent with information available at time $\tau$:
\[ \Omega_0(\tau) := \Big\{ (x_0, w_\tau)\ ;\ x_0 \in \mathbb{R}^n,\ w_\tau \in L_2[t_0, \tau], \text{ such that } \dot x(t) = A(t)x(t) + B_1(t)w_\tau(t) + B_2(t)u_\tau(t),\ \ y_\tau(t) = C_2(t)x(t) + D_{21}(t)w_\tau(t) \text{ for } t \in [t_0, \tau] \Big\}. \]
In order to accommodate different initial conditions for the solution to $Y_\infty$, define
\[ G(\tau, x_0, u_\tau, w_\tau) := -\gamma^2x_0'Q_2x_0 + \int_{t_0}^{\tau}\left\{z'z + x'\Delta_2x - \gamma^2w'w\right\}dt + V(\tau, x(\tau)) \tag{5.15} \]
for some given matrix $Q_2 > 0$, and also define the constrained optimization problem
\[ W(\tau, u_\tau, y_\tau) = \sup_{(x_0, w_\tau) \in \Omega_0(\tau)} G(\tau, x_0, u_\tau, w_\tau). \tag{5.16} \]
Since we are only trying to prove sufficiency, the following result is easily shown.

Theorem 5.18. For a given controller $\mu$, suppose that, for each $\tau \in [t_0, t_f]$, and for each $y_\tau$ on $[t_0, \tau]$, the maximization problem (5.16) has a bounded solution.
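Further, suppose that for all $t_1, t_2 \in [t_0, t_f]$ with $t_2 \ge t_1$, and for all $y$, $W(t_2, u_{t_2}, y_{t_2}) \le W(t_1, u_{t_1}, y_{t_1})$. That is, $W$ is a decreasing function of $\tau$ when the signal $u$ is defined by the feedback law $\mu$. Then, if $Q_2 \le Q_0$,
\[ \sup_{x_0 \in \mathbb{R}^n,\ w \in L_2} J(x_0, \mu, w, \Delta_2) \le \sup_{x_0}\left\{V(t_0, x_0) - \gamma^2x_0'Q_2x_0\right\}. \]

Proof. Clearly, since $Q_2 \le Q_0$,
\[ J(x_0, \mu, w, \Delta_2) \le W(t_f, u_{t_f}, y_{t_f}). \]
Since $W$ is decreasing,
\[ J(x_0, \mu, w, \Delta_2) \le W(t_0, u_{t_0}, y_{t_0}) = \sup_{x_0 \in \mathbb{R}^n}\left\{V(t_0, x_0) - \gamma^2x_0'Q_2x_0\right\} \qquad \forall w \in L_2. \]
Therefore, with this feedback strategy for the $u$ player, the result holds.

Let $Y_\Delta$ satisfy the following Riccati differential equation:
\[ \dot Y_\Delta = AY_\Delta + Y_\Delta A' - Y_\Delta(C_2'C_2 - \gamma^{-2}C_1'C_1 - \gamma^{-2}\Delta_2)Y_\Delta + B_1B_1', \qquad Y_\Delta(t_0) = Q_2^{-1}. \tag{5.17} \]
Then, it is a trivial modification of Lemma 3.21 to show

Lemma 5.19. Suppose there exists a matrix $Y_\Delta(t)$ satisfying the Riccati differential equation (5.17) on $[t_0, t_f]$, and such that $X_\Delta(\tau) - \gamma^2Y_\Delta^{-1}(\tau) < 0$ for $\tau \in [t_0, t_f]$. Then there exists a unique maximum to the constrained maximization problem defined by equation (5.16), and the worst case cost is given by
\[ W(\tau, u_\tau, y_\tau) = \int_{t_0}^{\tau}\left\{\hat x_\tau'(C_1'C_1 + \Delta_2)\hat x_\tau + u'u - \gamma^2\hat w_\tau'\hat w_\tau\right\}dt + \hat x_\tau(\tau)'X_\Delta(\tau)\hat x_\tau(\tau) - \gamma^2\hat x_0'Q_2\hat x_0 \]
where
\[ \begin{aligned} \dot{\hat x}_\tau &= A\hat x_\tau + B_1B_1'Y_\Delta^{-1}\hat x_\tau - B_1B_1'Y_\Delta^{-1}\bar x + B_2u, & \hat x_\tau(t_0) &= \hat x_0 \\ \dot{\bar x} &= A\bar x + \gamma^{-2}Y_\Delta(C_1'C_1 + \Delta_2)\bar x + Y_\Delta C_2'(y - C_2\bar x) + B_2u, & \bar x(t_0) &= 0 \end{aligned} \]
with boundary condition
\[ \hat x_\tau(\tau) = \left(I - \gamma^{-2}Y_\Delta(\tau)X_\Delta(\tau)\right)^{-1}\bar x(\tau) \tag{5.18} \]
and the worst case disturbance is given by
\[ \hat w_\tau = B_1'Y_\Delta^{-1}\hat x_\tau - B_1'Y_\Delta^{-1}\bar x + D_{21}'(y - C_2\hat x_\tau). \tag{5.19} \]

The following lemma is straightforward algebra.

Lemma 5.20. Suppose $Y_\Delta$ satisfies equation (5.17), and $X_\Delta$ satisfies equation (5.13). Suppose $Y_\Delta(t)(I - \gamma^{-2}X_\Delta(t)Y_\Delta(t))^{-1} > 0$, and define
\[ Z_\Delta(t) = Y_\Delta(t)\left(I - \gamma^{-2}X_\Delta(t)Y_\Delta(t)\right)^{-1}. \]
Then $Z_\Delta$ satisfies
\[ \dot Z_\Delta = (A + \gamma^{-2}B_1B_1'X_\Delta)Z_\Delta + Z_\Delta(A + \gamma^{-2}B_1B_1'X_\Delta)' - Z_\Delta\left(C_2'C_2 - \gamma^{-2}X_\Delta B_2B_2'X_\Delta + \gamma^{-2}(\Delta_1 - \Delta_2)\right)Z_\Delta + B_1B_1'. \]

Remark 5.21. This result has some important consequences. The solution of the Riccati differential inequalities in $X_\infty$ and $Y_\infty$ together with the coupling condition form a convex problem for each time $t$.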
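However, we cannot use this to construct solutions to the Riccati differential inequality in $Z_\infty$.

Let $\check x(\tau) = \hat x_\tau(\tau)$ for all $\tau$. The key result of this section is the following one.

Theorem 5.22. If there exist bounded solutions $X_\Delta$ and $Y_\Delta$ to equations (5.13) and (5.17), then the function $W(\tau, u_\tau, y_\tau)$ satisfies
\[ \frac{d}{d\tau}W(\tau, u_\tau, y_\tau) = -\gamma^2|y(\tau) - C_2(\tau)\check x(\tau)|^2 + |u(\tau) + B_2(\tau)'X_\Delta(\tau)\check x(\tau)|^2 - \check x(\tau)'(\Delta_1 - \Delta_2)\check x(\tau). \]

Hence, if $\Delta_1 - \Delta_2 > 0$ and $u(\tau) = -B_2(\tau)'X_\Delta(\tau)\check x(\tau)$ for all $\tau \in [t_0, t_f]$, then
\[ \frac{d}{d\tau}W(\tau, u_\tau, y_\tau) \le 0 \quad\text{for all } y_\tau. \]
Then, with this controller, using Theorem 5.18,
\[ \sup_{x_0 \in \mathbb{R}^n,\ w \in L_2} J(x_0, \hat\mu, w, \Delta_2) \le \sup_{x_0}\left\{V(t_0, x_0) - \gamma^2x_0'Q_2x_0\right\} = 0 \]
since by assumption $Y_\Delta(t_0) = Q_2^{-1}$, and $X_\Delta(t_0) - \gamma^2Y_\Delta^{-1}(t_0) < 0$. Therefore, this controller is $\gamma$-feasible for the original problem.

Proof of Theorem 5.22. This proof is identical in method to that of Lemma 3.22. In the same way as that proof, algebra gives
\[ \frac{d}{dt}\left(\hat x_\tau'Y_\Delta^{-1}\bar x\right) = u_\tau'B_2'Y_\Delta^{-1}\bar x - \bar x'Y_\Delta^{-1}B_1B_1'Y_\Delta^{-1}\bar x + y_\tau'C_2\hat x_\tau + \hat x_\tau'Y_\Delta^{-1}B_2u \]
where terms in $\Delta_2$ cancel. We arrive at
\[ \frac{d}{d\tau}W(\tau, u_\tau, y_\tau) = \frac{d}{d\tau}\left\{\hat x_\tau(\tau)'\left(\gamma^2Y_\Delta^{-1}(\tau) - X_\Delta(\tau)\right)\hat x_\tau(\tau)\right\} + u(\tau)'u(\tau) - \gamma^2y(\tau)'y(\tau) + \gamma^2\bar x(\tau)'Y_\Delta^{-1}(\tau)B_1(\tau)B_1(\tau)'Y_\Delta^{-1}(\tau)\bar x(\tau) - 2\gamma^2u(\tau)'B_2(\tau)'Y_\Delta^{-1}(\tau)\bar x(\tau) \]
and $\check x$ satisfies
\[ \check x = (I - \gamma^{-2}Y_\Delta X_\Delta)^{-1}\bar x = Z_\Delta Y_\Delta^{-1}\bar x \]
and hence
\[ \dot{\check x} = A\check x + \gamma^{-2}B_1B_1'X_\Delta\check x + Z_\Delta C_2'(y - C_2\check x) + \gamma^{-2}Z_\Delta X_\Delta B_2B_2'X_\Delta\check x + Z_\Delta Y_\Delta^{-1}B_2u - \gamma^{-2}Z_\Delta(\Delta_1 - \Delta_2)\check x. \]
Substituting as in Lemma 3.22 gives the desired result.

As it stands, this result is not quite what we would like to achieve, for two reasons. The first is that we need to guarantee that the condition $\Delta_1 \ge \Delta_2$ holds. This means we need to have bounds on $\Delta_1$ and $\Delta_2$. The other reason is that the controller dynamics depend on $\check x$, which itself explicitly is a function of $\Delta_1 - \Delta_2$. We can cure both of these problems. To cure the latter problem, define $\tilde x(t) = Y_\Delta^{-1}(t)\bar x(t)$. Then
\[ \dot{\tilde x} = -A'\tilde x - Y_\Delta^{-1}B_1B_1'\tilde x + C_2'y + Y_\Delta^{-1}B_2u \]
with boundary condition $\tilde x(t_0) = 0$. The controller formula is then
\[ u(t) = -B_2'X_\Delta(t)Z_\Delta(t)\tilde x(t) \]
and does not explicitly depend on the $\Delta_i$ matrices except through the Riccati solutions.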
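As a quick sanity check of the relation between $X_\Delta$, $Y_\Delta$ and $Z_\Delta$ in Lemma 5.20, the sketch below works with a scalar, time invariant plant and constant perturbations, and uses stationary (algebraic) solutions of (5.13) and (5.17) in place of the differential equations. These simplifications, the plant data, and the names `positive_root`, `coupling`, `residual` and `gain` are assumptions made purely for illustration.

```python
def positive_root(a_lin, r_quad, q_const):
    """Positive root S of 2*a_lin*S - r_quad*S**2 + q_const = 0 (needs r_quad > 0)."""
    return (a_lin + (a_lin**2 + r_quad * q_const) ** 0.5) / r_quad


# Hypothetical scalar data with D12'C1 = 0 and D21 B1' = 0 (not from the text).
a, b1, b2, c1, c2, gamma = 1.0, 1.0, 1.0, 1.0, 1.0, 3.0
delta1, delta2 = 0.5, 0.2          # constant perturbations with delta1 >= delta2
g2 = gamma**-2

# Stationary versions of (5.13) and (5.17).
X = positive_root(a, b2**2 - g2 * b1**2, c1**2 + delta1)
Y = positive_root(a, c2**2 - g2 * (c1**2 + delta2), b1**2)

# Coupling condition X - gamma^2 / Y < 0 is equivalent to coupling > 0.
coupling = 1.0 - g2 * X * Y
Z = Y / coupling                   # scalar version of Z = Y (I - gamma^{-2} X Y)^{-1}

# Residual of the stationary form of the Riccati equation in Lemma 5.20.
residual = (2.0 * (a + g2 * b1**2 * X) * Z
            - (c2**2 - g2 * b2**2 * X**2 + g2 * (delta1 - delta2)) * Z**2
            + b1**2)

# Controller gain acting on the transformed state: u = -B2' X Z x_tilde.
gain = -b2 * X * Z
```

The residual should vanish up to rounding error, and the gain involves the perturbations only through `X` and `Z`, in line with the remark above that the controller formula does not depend explicitly on the $\Delta_i$.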
In order to avoid interdependence of the $\Delta_i$ matrices, we make use of the following Lemma.

Lemma 5.23. Suppose there exists a bounded solution to the Riccati differential equation
\[ \dot Y_\Delta = AY_\Delta + Y_\Delta A' - Y_\Delta(C_2'C_2 - \gamma^{-2}C_1'C_1 - \gamma^{-2}\Delta_2)Y_\Delta + B_1B_1' \tag{5.20} \]
on $[t_0, t_f]$ with boundary condition $Y_\Delta(t_0) = Q_2^{-1}$. If there exists a bounded strictly positive definite solution to the Riccati differential equation
\[ \dot Z_\Delta = \left(A + (B_1B_1' - \gamma^2B_2B_2')Y_\Delta^{-1}\right)Z_\Delta + Z_\Delta\left(A + (B_1B_1' - \gamma^2B_2B_2')Y_\Delta^{-1}\right)' + Z_\Delta\left(\gamma^2Y_\Delta^{-1}B_2B_2'Y_\Delta^{-1} - \gamma^{-2}\Delta_3 - C_2'C_2\right)Z_\Delta - B_1B_1' + \gamma^2B_2B_2' \tag{5.21} \]
with boundary condition
\[ Z_\Delta(t_f) = \left(Y_\Delta^{-1}(t_f) - \gamma^{-2}Q_1\right)^{-1} \tag{5.22} \]
then the matrix $X_\Delta$ defined by
\[ X_\Delta(t) := \gamma^2Y_\Delta^{-1}(t) - \gamma^2Z_\Delta^{-1}(t) \]
satisfies the Riccati differential equation
\[ -\dot X_\Delta = A'X_\Delta + X_\Delta A - X_\Delta(B_2B_2' - \gamma^{-2}B_1B_1')X_\Delta + C_1'C_1 + \Delta_3 + \Delta_2 \tag{5.23} \]
with boundary condition
\[ X_\Delta(t_f) = Q_1 \tag{5.24} \]
and the coupling condition $Y_\Delta^{-1}(t) - \gamma^{-2}X_\Delta(t) > 0$ holds for all $t \in [t_0, t_f]$. Conversely, assuming there exists a bounded solution $Y_\Delta$ to (5.20) with boundary condition $Y_\Delta(t_0) = Q_2^{-1}$, then if there exists a bounded solution $X_\Delta$ to equation (5.23) with boundary condition (5.24) satisfying the coupling condition $Y_\Delta^{-1}(t) - \gamma^{-2}X_\Delta(t) > 0$, then the matrix defined by
\[ Z_\Delta := \left(Y_\Delta^{-1} - \gamma^{-2}X_\Delta\right)^{-1} \]
satisfies the Riccati differential equation (5.21) with boundary condition (5.22).

Proof. This is straightforward to show by direct differentiation.

We now can guarantee that $\Delta_1 \ge \Delta_2$ by choosing arbitrary $\Delta_2 \ge 0$ and $\Delta_3 \ge 0$. Then the matrix function $\Delta_1$ is given by $\Delta_2 + \Delta_3$. As a result, we do not need to know the $\Delta$ matrices explicitly; we can simply solve the Riccati differential inequalities. Hence we can now construct controllers for the finite horizon time varying induced 2-norm problem by solving Riccati differential inequalities. The solution presented here has the advantage over the simpler solution based upon the bounded real lemma by Wu et al. [79] in that the controllers do not depend on $\dot X_\infty$ and $\dot Y_\infty$.
This in turn implies that, if we solve this problem for parameter varying systems as a special case of the time varying problem, the resulting controller does not depend on the rate of change of the parameter. However, we lose an important feature of the solutions of [79], namely convexity: the constructive procedure here is not convex. Since the Riccati differential inequality for $Z_\Delta$ depends on the product $Y_\Delta^{-1}Z_\Delta$, we cannot solve both inequalities together and have the problem remain simultaneously convex in $Y_\Delta$ and $Z_\Delta$, which we could do were we solving the original inequalities in $X_\Delta$ and $Y_\Delta$. Thus we have shown that rate independent $\gamma$-suboptimal $H_\infty$ controllers exist, but the construction we have given here is unfortunately not convex. Collecting these results gives the following theorem.

Theorem 5.24. Suppose there exist bounded solutions to the Riccati differential inequalities
\[ \dot{\bar Y} \le -\bar YA - A'\bar Y - \bar YB_1B_1'\bar Y + C_2'C_2 - \gamma^{-2}C_1'C_1, \qquad \bar Y(t_0) \le Q_0 \]
and
\[ \dot{\bar Z} \ge -\bar Z\left(A + (B_1B_1' - \gamma^2B_2B_2')\bar Y\right) - \left(A + (B_1B_1' - \gamma^2B_2B_2')\bar Y\right)'\bar Z + \bar Z(B_1B_1' - \gamma^2B_2B_2')\bar Z - \gamma^2\bar YB_2B_2'\bar Y + C_2'C_2, \qquad \bar Z(t_f) \le \bar Y(t_f) - \gamma^{-2}Q_f. \]
Then the controller
\[ \begin{aligned} \dot{\tilde x} &= -A'\tilde x - \bar YB_1B_1'\tilde x + C_2'y + \bar YB_2u, \qquad \tilde x(t_0) = 0 \\ u &= -\gamma^2B_2'(\bar Y\bar Z^{-1} - I)\tilde x \end{aligned} \]
guarantees
\[ \|z\|_2^2 + x(t_f)'Q_fx(t_f) \le \gamma^2\|w\|_2^2 + \gamma^2x_0'Q_0x_0 \]
for all $w \in L_2[t_0, t_f]$ and all $x_0 \in \mathbb{R}^n$.

Although the formulae derived here are similar to those of the previous section, they are not identical. We have now derived independent inequalities for $\bar Y$ and $\bar Z$, rather than for $X$ and $Z$. Also, the standard controller state description of Section 5.2 depends on the parameters $\Delta_1$ and $\Delta_2$ in this case.

In this chapter we have extended the solution of [43] to remove the assumptions that the solutions to the Riccati differential equations are stabilizing. We have also shown that these equations can be replaced by inequalities. For the dynamic game problem, we have perturbed the separated cost function to derive solutions in terms of Riccati differential inequalities also.
In the next chapter we will use the sufficient conditions for existence and construction of $\gamma$-feasible controllers presented in this chapter to show that the moving horizon $H_\infty$ control problem also results in stabilizing $\gamma$-feasible controllers.

6. MOVING HORIZON $H_\infty$ CONTROL

6.1 Motivation

The moving horizon control technique was developed using the linear-quadratic regulator in the 1970s, with one of the first papers being that by Kleinman [44] in 1970. The techniques in his paper were later reformulated and generalized by various authors; see for example [18, 46, 45]. Kwon and Pearson [46] proved stability for linear time varying systems using a controller which optimized a quadratic cost function integrated over a time interval from the current time $t$ to a fixed distance ahead $t + T$. That is, at each time $t$, the controller solves
\[ \inf_{\mu \in \mathcal{U}_{sf}} \int_t^{t+T}\left\{x(s)'R(s)x(s) + u(s)'u(s)\right\}ds + x(t+T)'Q_fx(t+T). \]
Theoretically, the resulting controller defined on the interval $[t, t+T]$ is only actually used at time $t$ before being recalculated. The solution to this problem was given in terms of a Riccati differential equation, integrated backwards from time $t + T$ at each time $t$. Kwon et al. [45] also formulated a general procedure for the forwards integration of such solutions. This allows a continuous update of the time varying controller.

If the terminal weight $Q_f$ is chosen to satisfy certain conditions, or is replaced by the terminal constraint $x(t+T) = 0$, then the closed loop system will be exponentially stable. Indeed, it is intuitively reasonable from the above form of cost function that a large terminal weight would tend to result in a stable closed loop system, and in fact for the quadratic state feedback problem, choosing $Q_f$ sufficiently large will result in stability. We may also regard the constraint $x(t+T) = 0$ as a limiting case, corresponding to $Q_f = \infty$.
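The effect of the terminal weight can be illustrated with a minimal sketch of the linear-quadratic receding horizon gain for a scalar, time invariant, open loop unstable plant. The plant data, the short horizon, the two terminal weights, and the names `lq_riccati_backward` and `closed_loop_pole` are illustrative assumptions, not taken from the cited papers.

```python
def lq_riccati_backward(a, b, q, p_final, horizon, steps=2000):
    """Integrate -dP/dt = 2*a*P - b**2 * P**2 + q backwards from P(t+T) = p_final.

    Classical RK4 in reversed time; returns P at the start of the horizon.
    """
    def rhs(p):
        return 2.0 * a * p - b * b * p * p + q

    h = horizon / steps
    p = p_final
    for _ in range(steps):
        k1 = rhs(p)
        k2 = rhs(p + 0.5 * h * k1)
        k3 = rhs(p + 0.5 * h * k2)
        k4 = rhs(p + h * k3)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return p


# Hypothetical unstable scalar plant dx/dt = a x + b u with state weight q.
a, b, q, T = 1.0, 1.0, 1.0, 0.1


def closed_loop_pole(Qf):
    """Closed loop pole a - b^2 P under the receding horizon law u = -b P x."""
    p = lq_riccati_backward(a, b, q, Qf, T)
    return a - b * b * p


pole_small = closed_loop_pole(0.0)    # no terminal weight, short horizon
pole_large = closed_loop_pole(10.0)   # large terminal weight, same horizon
```

For these particular numbers the short horizon with no terminal weight leaves the closed loop pole in the right half plane, while the large terminal weight moves it into the left half plane, which matches the intuition described above.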
For time varying systems, this gives a method of stabilizing the system which depends quantitatively on the parametric description of the system for only a finite time ahead, and depends on more qualitative features of the long term plant behaviour. Further, this method allows stabilization of systems which are known only for the short term future.

In dealing with long term variation of systems, there are at present two extreme options. The first is to assume a time invariant system and design a controller robust against time varying perturbations, which usually leads to very conservative controller design. The other is to design fully for time varying systems. In the case of both linear quadratic and $H_\infty$ controllers, this requires the backwards integration from infinity of a Riccati equation (see for example [69, 43]), and somewhat optimistically assumes knowledge of the system throughout future time. The moving horizon method can be viewed as a compromise between these two methods.

This problem can be viewed in the following way: given the set of stabilizable time varying $(A, B)$ pairs of fixed dimension, does there exist a map to the set of time varying state feedback laws such that for each $(A, B)$ pair the resulting closed loop is exponentially stable, and the corresponding controller depends only causally on $(A, B)$? That is, is it possible to decide upon the current input based only on knowledge of the system matrices $A$ and $B$ up until now? The moving horizon strategy does not do quite this, since it uses information about the system matrices up to some fixed time $T$ ahead, and indeed the system has to satisfy other conditions for all time. However, we shall show that not only is the resulting closed loop exponentially stable, but it is also possible to construct such controllers so that the closed loop satisfies a prespecified induced norm bound.
The quadratic moving horizon method described above is motivated as a state feedback strategy, although since it has been shown to be equivalent to an infinite horizon regulator problem for certain choices of terminal weight $Q_f$, it has also been used as an output feedback strategy with a Kalman filter providing the state estimates.

In this chapter, a receding horizon controller is formulated with each finite horizon optimization based upon an $H_\infty$ optimization. It is hoped that the advantages of receding horizon control might be combined with the robustness advantages of $H_\infty$ control. Tadmor [68] proved, under certain assumptions and with a terminal constraint, that a similar controller was stable and satisfied an infinite horizon norm bound. We generalize these results in the state feedback case and construct a new controller for the output feedback case.

In the observation feedback case it is difficult to produce a natural formulation of the $H_\infty$ receding horizon control problem. For the finite horizon problem, typically the assumptions made are that the initial state at the beginning of each optimization interval is completely known, or is completely unknown. In the latter case it is treated as part of the disturbance and subject to a quadratic weighting [72, 42]. In the receding horizon problem, though, we have observations from before the optimization interval.

We first consider the state feedback problem, with the above quadratic cost function replaced by the game theoretic cost function of previous chapters. We will then consider the output feedback case.

6.2 State feedback

We will again consider the continuous time system
\[ \begin{aligned} \dot x(t) &= A(t)x(t) + B_1(t)w(t) + B_2(t)u(t) \\ z(t) &= C_1(t)x(t) + D_{12}(t)u(t) \\ y(t) &= C_2(t)x(t) + D_{21}(t)w(t) \end{aligned} \tag{6.1} \]
defined on $\mathbb{R}_+$. Again we will make the nonsingularity assumptions that $D_{12}(t)'D_{12}(t) = I$ and $D_{21}(t)D_{21}(t)' = I$ for all $t \in \mathbb{R}_+$. We will also assume that the system matrices are all bounded functions of time.
6.2.1 The moving horizon differential game

The principal idea behind moving horizon control is that of solving a finite horizon optimization problem at each time $t \ge 0$, over the interval $[t, t+T]$. In this way, at each time $t$ the resulting controller depends only on the plant parameters up to a finite time ahead.

We will consider controllers in the space of continuous time state feedback laws given by
\[ \mathcal{U}_{sf}^{n,m_2}[t_1, t_2] := \left\{ v : [t_1, t_2] \times \mathbb{R}^n \to \mathbb{R}^{m_2}\ ;\ \forall x \in L_2^n[t_1, t_2],\ v(\cdot, x(\cdot)) \in L_2^{m_2}[t_1, t_2] \right\}. \]
The moving horizon problem we will solve, at each time $t > 0$, will be
\[ \inf_{\mu \in \mathcal{U}_{sf}[t, t+T]}\ \sup_{w \in L_2[t, t+T]} \int_t^{t+T}\left\{z(s)'z(s) - \gamma^2w(s)'w(s)\right\}ds + x(t+T)'F(t+T)x(t+T) \tag{6.2} \]
where $F(t+T) > 0$ is some given positive definite weighting matrix on the terminal state. At each time $t$ we construct the state feedback law $u(t) = \mu(t, x(t))$. Ideally this control law is only implemented instantaneously, at time $t$, although in practice it might be implemented on a short finite interval $[t, t+\delta]$. At time $t + \delta$, a new optimization would be performed over the interval $[t+\delta, t+\delta+T]$. In this chapter we will assume that the controller is updated continuously, and in Chapter 7 we will discuss the case when discrete updates are performed.

For each separate optimization problem, the initial state may not be zero, and hence a direct induced norm interpretation on each finite interval is not possible, since zero input $u$ would give a nonzero output $z$ on the interval $[t, t+T]$, and thus the induced norm on this interval is undefined.

The terminal state penalty $F$ is often incorporated into finite horizon control problems to allow for compromises between the norm of $z$ and the size of the final state $x(t_f)$. In the moving horizon case, we will show that particular choices of $F$ guarantee closed loop stability.

An expression for the moving horizon controller. From the results of Chapter 3, we can solve the game theoretic problem explicitly on each interval $[t, t+T]$.
There exists a saddle point solution to the finite horizon state feedback problem (6.2) if and only if there exists a bounded solution to the Riccati differential equation
\[ -\dot X_\infty = (A - B_2D_{12}'C_1)'X_\infty + X_\infty(A - B_2D_{12}'C_1) - X_\infty(B_2B_2' - \gamma^{-2}B_1B_1')X_\infty + C_1'D_{12}^{\perp}D_{12}^{\perp\prime}C_1 \tag{6.3} \]
on $[t, t+T]$ with boundary condition $X_\infty(t+T) = F(t+T)$. This result can also be found in [50].

For the moving horizon problem, we will write the solution $X_\infty$ to this Riccati differential equation on the interval $[t_1, t_2]$ with boundary condition $X_\infty(t_2) = F$ as $X_\infty(t_1, t_2, F)$. Then, solving the problem over each interval $[t, t+T]$, it is clear that this must satisfy the matrix partial differential equation
\[ -\frac{\partial}{\partial t}X_\infty(t, s, F(s)) = (A - B_2D_{12}'C_1)(t)'X_\infty(t, s, F(s)) + X_\infty(t, s, F(s))(A - B_2D_{12}'C_1)(t) - X_\infty(t, s, F(s))(B_2B_2' - \gamma^{-2}B_1B_1')(t)X_\infty(t, s, F(s)) + C_1'D_{12}^{\perp}D_{12}^{\perp\prime}C_1(t) \tag{6.4} \]
for $s \ge t \ge 0$, with boundary condition
\[ X_\infty(s, s, F(s)) = F(s) \quad\text{for all } s \ge T. \]
The state feedback moving horizon controller is then given by
\[ u(t) = -\left(B_2(t)'X_\infty(t, t+T, F(t+T)) + D_{12}(t)'C_1(t)\right)x(t). \]
Note that, in the case when the original system is linear time invariant, the resulting controller is time invariant also. We will use the following lemmas to analyse the moving horizon $H_\infty$ controller.

Lemma 6.1. Suppose $F$ is a bounded positive solution of
\[ \dot F + (A - B_2D_{12}'C_1)'F + F(A - B_2D_{12}'C_1) - F(B_2B_2' - \gamma^{-2}B_1B_1')F + C_1'(I - D_{12}D_{12}')C_1 \le 0 \tag{6.5} \]
on the interval $[t_1, t_2]$, and $X_\infty$ is a bounded positive solution of equation (6.3) on the interval $[t_1, t_2]$. Further suppose that $X_\infty(t_2) = F(t_2)$. Then $X_\infty(t) \le F(t)$ for $t \in [t_1, t_2]$.

Proof. Using the completion of squares argument of Theorem 3.14, we have
\[ x(t)'X_\infty(t)x(t) = \inf_{\mu \in \mathcal{U}_{sf}[t, t_2]}\ \sup_{w \in L_2[t, t_2]} \int_t^{t_2}\left\{z(s)'z(s) - \gamma^2w(s)'w(s)\right\}ds + x(t_2)'F(t_2)x(t_2) \]
and
\[ x(t)'F(t)x(t) = \inf_{\mu \in \mathcal{U}_{sf}[t, t_2]}\ \sup_{w \in L_2[t, t_2]} \int_t^{t_2}\left\{|z(s)|^2 + x(s)'\Delta(s)x(s) - \gamma^2|w(s)|^2\right\}ds + x(t_2)'F(t_2)x(t_2) \]
for some matrix function $\Delta \ge 0$.
Since
\[ f(u, w) \le g(u, w) \implies \inf_{u \in \mathcal{U}_{sf}}\ \sup_{w \in L_2} f(u, w) \le \inf_{u \in \mathcal{U}_{sf}}\ \sup_{w \in L_2} g(u, w) \]
we have immediately the required result.

Lemma 6.2. The solution $X_\infty$ to the Riccati differential equation (6.4) satisfies, for all $t_3 \ge t_2 \ge t_1 \ge 0$ and for all $F \ge 0$,
a) $X_\infty(t_1, t_2, X_\infty(t_2, t_3, F)) = X_\infty(t_1, t_3, F)$
b) $F > 0 \implies X_\infty(t_1, t_2, F) > 0$
c) $C_1'D_{12}^{\perp}D_{12}^{\perp\prime}C_1 > 0 \implies X_\infty(t_1, t_2, F) > 0$
d) $F_1 \ge F_2 \ge 0 \implies X_\infty(t_1, t_2, F_1) \ge X_\infty(t_1, t_2, F_2)$

Proof.
a) This is a straightforward consequence of the definition of $X_\infty$.
b) See Proposition 3.19.
c) See Proposition 3.19.
d) This follows, similarly to Lemma 6.1, from
\[ f(u, w) \le g(u, w) \implies \inf_{u \in \mathcal{U}_{sf}}\ \sup_{w \in L_2} f(u, w) \le \inf_{u \in \mathcal{U}_{sf}}\ \sup_{w \in L_2} g(u, w). \]

6.2.2 Stability of the receding horizon controller

For the linear quadratic problem with a terminal weight, the closed loop was shown to be stable under certain conditions by Kwon, Bruckstein and Kailath [45]. The problem when $F = \infty$, that is, when the terminal state is constrained to be zero, was considered by Tadmor [68]. The techniques we will use are similar to those in [45]. We will first need the following Lemma. Related results can be found in [45, 2, 59, 17, 57, 7].

Lemma 6.3. Suppose $F$ satisfies
\[ \dot F + (A - B_2D_{12}'C_1)'F + F(A - B_2D_{12}'C_1) - F(B_2B_2' - \gamma^{-2}B_1B_1')F + C_1'(I - D_{12}D_{12}')C_1 \le 0 \tag{6.6} \]
and $F(t) \ge 0$ for all $t \ge 0$. Suppose also that for each $t \ge 0$, there exists a bounded nonnegative definite solution to the Riccati differential equation (6.3) on the interval $[t, t+T]$, with boundary condition $X_\infty(t+T) = F(t+T)$. Then
\[ X_\infty(t_1, t_2, F(t_2)) \ge X_\infty(t_1, t_3, F(t_3)) \]
for all $t_3 \ge t_2 \ge t_1 \ge 0$. That is, $X_\infty(t, s, F(s))$ is non-increasing with respect to $s$.

Proof. By definition, $X_\infty(t_3, t_3, F(t_3)) = F(t_3)$. Then from Lemma 6.1, we have $X_\infty(t, t_3, F(t_3)) \le F(t)$ for all $t \le t_3$.
Therefore, for $t_2 \le t_3$,
\[ X_\infty(t_2, t_3, F(t_3)) \le F(t_2). \]
If we consider this inequality at time $t_2$ as a boundary condition, and apply Lemma 6.2(d), we see
\[ X_\infty\big(t_1, t_2, X_\infty(t_2, t_3, F(t_3))\big) \le X_\infty\big(t_1, t_2, F(t_2)\big) \]
and hence, applying Lemma 6.2(a),
\[ X_\infty(t_1, t_3, F(t_3)) \le X_\infty(t_1, t_2, F(t_2)). \]

The following theorem is the required stability result.

Theorem 6.4. Suppose the pair $(A - B_2D_{12}'C_1, D_{12}^{\perp\prime}C_1)$ is detectable, the pair $(A, B_2)$ is stabilizable, and let $F$ satisfy (6.6), with $F(t) \ge 0$ for $t \ge 0$. Suppose furthermore that for each $t \ge 0$, there exists a bounded nonnegative definite solution to the Riccati differential equation (6.3) on the interval $[t, t+T]$, with boundary condition $X_\infty(t+T) = F(t+T)$. Then, if there exists $\alpha > 0$ such that $X_\infty(t, t+T, F(t+T)) < \alpha I$ for all $t \ge 0$, the control law
\[ u(t) = -\left(B_2(t)'X_\infty(t, t+T, F(t+T)) + D_{12}(t)'C_1(t)\right)x(t) \tag{6.7} \]
applied to the system (6.1) results in an exponentially stable closed loop.

Proof. We will write $X(t) = X_\infty(t, t+T, F(t+T))$. Then, with the controller in place, the homogeneous part of the closed loop dynamics is given by $\dot x(t) = \Phi(t)x(t)$, where
\[ \Phi(t) = A(t) - B_2(t)B_2(t)'X(t) - B_2(t)D_{12}(t)'C_1(t). \]
The matrix $X(t)$ satisfies
\[ \dot X + (A - B_2D_{12}'C_1)'X + X(A - B_2D_{12}'C_1) - X(B_2B_2' - \gamma^{-2}B_1B_1')X + C_1'(I - D_{12}D_{12}')C_1 - \left.\frac{\partial}{\partial s}X_\infty(t, s, F(s))\right|_{s=t+T} = 0 \tag{6.8} \]
which we can rewrite as
\[ \dot X + \Phi'X + X\Phi + XB_2B_2'X + \gamma^{-2}XB_1B_1'X + C_1'(I - D_{12}D_{12}')C_1 - \left.\frac{\partial}{\partial s}X_\infty(t, s, F(s))\right|_{s=t+T} = 0. \]
Lemma 6.3 implies
\[ \left.\frac{\partial}{\partial s}X_\infty(t, s, F(s))\right|_{s=t+T} \le 0 \]
and if we write
\[ \tilde C = \begin{bmatrix} D_{12}^{\perp\prime}C_1 \\ B_2'X \end{bmatrix} \]
then $X$ satisfies
\[ \dot X + \Phi'X + X\Phi \le -\tilde C'\tilde C. \]
Since $(A - B_2D_{12}'C_1, D_{12}^{\perp\prime}C_1)$ is detectable, $(\Phi, \tilde C)$ is detectable also, and thus applying Lemma 5.1 gives the desired result.

Remark 6.5. Note that, although in the above proof we have assumed differentiability of $X_\infty(t, s, F(s))$ with respect to $s$, it is straightforward to extend the result to the case when the differentiability assumption does not hold. In this case, the differential equations for $X$ are replaced by integral equations.
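The monotonicity property of Lemma 6.3 and the stability conclusion of Theorem 6.4 can be checked numerically in the simplest possible setting. The sketch below assumes a scalar, time invariant plant with $D_{12}'C_1 = 0$ and a constant terminal weight; the numerical values and the names `x_inf_horizon`, `lhs_66`, `horizons`, `X` and `poles` are hypothetical and chosen only for illustration.

```python
def x_inf_horizon(a, r, q, F, T, steps=4000):
    """Scalar X_inf(t, t+T, F): integrate -dX/dt = 2*a*X - r*X**2 + q backwards
    from the terminal condition X(t+T) = F over a horizon of length T (RK4)."""
    def rhs(x):
        return 2.0 * a * x - r * x * x + q

    h = T / steps
    x = F
    for _ in range(steps):
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * h * k1)
        k3 = rhs(x + 0.5 * h * k2)
        k4 = rhs(x + h * k3)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x


# Hypothetical scalar LTI data (not from the text), with D12'C1 = 0.
a, b1, b2, c1, gamma = 1.0, 1.0, 2.0, 1.0, 2.0
r = b2**2 - b1**2 / gamma**2
F = 2.0

# Left hand side of (6.6) for a constant F; it should be nonpositive.
lhs_66 = 2.0 * a * F - r * F * F + c1**2

# X_inf(t, t+T, F) for increasing horizon lengths T.
horizons = [0.1, 0.2, 0.5, 1.0, 2.0]
X = [x_inf_horizon(a, r, c1**2, F, T) for T in horizons]

# Closed loop poles a - b2^2 X under the moving horizon law u = -b2 X x.
poles = [a - b2**2 * x for x in X]
```

With this choice of `F` the computed values decrease as the horizon grows and stay below `F`, and every closed loop pole is negative, which is what the lemma and theorem predict for this example.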
The integral equation technique is used by Tadmor [69] to solve the general linear $H_\infty$ problem.

Remark 6.6. Clearly the assumption that $(A, B_2)$ is stabilizable is necessary for the existence of a stabilizing state feedback linear time varying controller. However, we have not used it in the above argument. This implies that, if $(A - B_2D_{12}'C_1, D_{12}^{\perp\prime}C_1)$ is detectable, existence of a positive solution $X$ to the Riccati equation (6.8) implies that $(A, B_2)$ is stabilizable.

Clearly, the condition that $F$ satisfy equation (6.6) is not necessary for exponential stability. Nor indeed is the condition that
\[ \left.\frac{\partial}{\partial s}X_\infty(t, s, F(s))\right|_{s=t+T} \le 0 \]
since this might correspond to a different $C_1$ matrix, for which
\[ C_1'(I - D_{12}D_{12}')C_1 - \left.\frac{\partial}{\partial s}X_\infty(t, s, F(s))\right|_{s=t+T} > 0. \]

6.2.3 Infinite horizon norm bounds

Theorem 6.7. If the conditions of Theorem 6.4 hold, and $x_0 = 0$, then the controller (6.7) results in
\[ \|z\|_2^2 - \gamma^2\|w\|_2^2 \le 0 \quad\text{for all } w \in L_2[0, \infty) \]
where the 2-norms are defined on $L_2[0, \infty)$. That is, with this controller in place, the closed loop induced norm satisfies $\|T_{zw}\| \le \gamma$.

Proof. Completing the square as in Section 3.4.3 gives
\[ \int_0^\infty\left\{\frac{d}{dt}\big(x(t)'X(t)x(t)\big) + z(t)'z(t) - \gamma^2w(t)'w(t)\right\}dt = \int_0^\infty\left\{|u + B_2'Xx + D_{12}'C_1x|^2 - \gamma^2|w - \gamma^{-2}B_1'Xx|^2 + x'\left.\frac{\partial}{\partial s}X_\infty(t, s, F(s))\right|_{s=t+T}x\right\}dt. \]
Therefore, with $u = -B_2'Xx - D_{12}'C_1x$, writing $w^* = \gamma^{-2}B_1'Xx$, since the closed loop is exponentially stable, and by assumption $X(\cdot)$ is bounded, we have
\[ -x_0'X(0)x_0 + \|z\|_2^2 - \gamma^2\|w\|_2^2 = -\gamma^2\|w - w^*\|_2^2 + \int_0^\infty x'\left.\frac{\partial}{\partial s}X_\infty(t, s, F(s))\right|_{s=t+T}x\,dt. \]
Since $x_0 = 0$, using Lemma 6.3, we arrive at
\[ \|z\|_2^2 - \gamma^2\|w\|_2^2 \le 0 \quad\text{for all } w \in L_2[0, \infty) \]
as desired.

This is the key result of this section. It shows that, using only the conditions specified in Theorem 6.4 on the terminal state weight $F(t)$, we can achieve an induced norm bound over the whole of future time. There are a number of questions we would like to ask about this scheme. The first is, when does a matrix $F$ satisfying these conditions exist?

Proposition 6.8.
For a given time varying system of the form (6.1) such that $(A, B_2)$ is stabilizable and $(A - B_2 D_{12}' C_1,\ D_{12}^{\perp\prime} C_1)$ is detectable, if there exists a stabilizing state feedback controller $\mu \in \mathcal{U}_{sf}[0, \infty)$ such that the closed loop system has $\|T_{zw}\| < \gamma$, then there exists a solution to the Riccati differential equation
\[ \dot{F} + (A - B_2 D_{12}' C_1)' F + F (A - B_2 D_{12}' C_1) - F (B_2 B_2' - \gamma^{-2} B_1 B_1') F + C_1' (I - D_{12} D_{12}') C_1 = 0. \]
This is proved in [43].

This implies that, for any given time varying system, if there exists a $\gamma$-feasible stabilizing state feedback controller, then there exists a $\gamma$-feasible state feedback moving horizon controller. Of course, this statement is in fact not very useful, since the moving horizon controller would be the infinite horizon controller. In the construction of this controller, it would be necessary to use the exact model of the plant over all future time.

Linear time invariant systems. Similar conditions, which make use of complete knowledge of the future plant to construct moving horizon controllers, are well known for linear time invariant plants for quadratic regulator problems. One in particular is the so-called `fake' algebraic Riccati equation, introduced by Poubelle et al. [56]. However, Poubelle considered the linear-quadratic moving horizon problem, for which the conditions required for stability are simpler. We derive here a similar result for the $H_\infty$ problem as follows.

Lemma 6.9. Consider the LTI system (6.1) where all matrices are constant. Suppose that $(A, B_2)$ is stabilizable and $(A - B_2 D_{12}' C_1,\ D_{12}^{\perp\prime} C_1)$ is detectable, in the time invariant sense. Let $F > 0$ be a fixed symmetric matrix.
Suppose for each $t \ge 0$ there exists a saddle point for the finite horizon differential game
\[ \inf_{\mu \in \mathcal{U}_{sf}[t, t+T]} \ \sup_{w \in L_2[t, t+T]} \int_t^{t+T} \bigl( z(r)' z(r) - \gamma^2 w(r)' w(r) \bigr) dr + x(t+T)' F x(t+T) \]
and write
\[ X := X_\infty(t, t+T; F), \qquad \Delta := \frac{\partial}{\partial t} X_\infty(t, s; F) \Big|_{s=t+T}. \]
Then if $\Delta \ge 0$, and if $A - B_2 D_{12}' C_1 - (B_2 B_2' - \gamma^{-2} B_1 B_1') X$ is stable, then the moving horizon controller
\[ u(t) = -\bigl( B_2' X_\infty(t, t+T; F) + D_{12}' C_1 \bigr) x(t) \tag{6.9} \]
stabilises the closed loop system and results in $\|T_{zw}\| \le \gamma$.

Proof. Differentiation implies
\[ \Delta = \dot{X} - \frac{\partial}{\partial s} X_\infty(t, s; F) \Big|_{s=t+T} \]
and hence $X$ satisfies
\[ (A - B_2 D_{12}' C_1)' X + X (A - B_2 D_{12}' C_1) - X (B_2 B_2' - \gamma^{-2} B_1 B_1') X + C_1' (I - D_{12} D_{12}') C_1 + \Delta = 0. \]
If we now consider the new output $\tilde{z} = \tilde{C}_1 x + \tilde{D}_{12} u$ where
\[ \tilde{C}_1 = \begin{bmatrix} C_1 \\ \Delta^{1/2} \end{bmatrix}, \qquad \tilde{D}_{12} = \begin{bmatrix} D_{12} \\ 0 \end{bmatrix} \]
then $X$ satisfies the `fake' algebraic Riccati equation
\[ (A - B_2 \tilde{D}_{12}' \tilde{C}_1)' X + X (A - B_2 \tilde{D}_{12}' \tilde{C}_1) - X (B_2 B_2' - \gamma^{-2} B_1 B_1') X + \tilde{C}_1' (I - \tilde{D}_{12} \tilde{D}_{12}') \tilde{C}_1 = 0. \tag{6.10} \]
Then, from [22], since $X$ is a stabilising solution to this equation, with the controller given by equation (6.9) the closed loop norm from $w$ to $\tilde{z}$ must be strictly less than $\gamma$. Since $\|\tilde{z}\|_2^2 \ge \|z\|_2^2$ for any $w$, the original system has an induced norm strictly bounded by $\gamma$ also.

In other words, if the solution to the finite horizon Riccati differential equation is non-decreasing in the above sense, then the moving horizon controller will be $\gamma$-feasible and, for the LTI problem, the above result gives an alternative proof to Theorem 6.7. Further, we know how conservative such a controller will be, since not only is it $\gamma$-feasible for the original system (6.1), but it is also $\gamma$-feasible for the system with the larger output $\tilde{z}$. Hence the rate of increase of $X_\infty$, as measured by $\Delta$, is important in assessing the level of conservatism of the performance achieved by the moving horizon controller. The following Lemma gives a condition for the solution of such a Riccati differential equation to be monotonic.

Lemma 6.10. Let $A$, $R$ and $Q$ be constant matrices, and let $X$ satisfy
\[ -\dot{X} = A' X + X A - X R X + Q \]
on the interval $[t_0, t_f]$, with boundary condition $X(t_f) = Q_f$.
Then, if $\dot{X}(t_f) \le 0$, then $X$ is non-increasing for all $t \in [t_0, t_f]$. Similarly, if $\dot{X}(t_f) \ge 0$, then $X$ is non-decreasing for all $t \in [t_0, t_f]$.

Proof. Differentiating gives
\[ -\ddot{X} = (A - R X)' \dot{X} + \dot{X} (A - R X) \]
which is a Lyapunov equation in $\dot{X}$, and so applying Proposition 3.19 to this equation, we arrive at
\[ \dot{X}(t) = \Phi(t_f, t)' \dot{X}(t_f) \Phi(t_f, t) \]
where $\Phi$ is the transition matrix associated with $A - R X$, and the result follows.

We can use this result to immediately give the LTI version of Theorem 6.7. For linear time invariant systems, choosing a fixed $F$ such that
\[ (A - B_2 D_{12}' C_1)' F + F (A - B_2 D_{12}' C_1) - F (B_2 B_2' - \gamma^{-2} B_1 B_1') F + C_1' (I - D_{12} D_{12}') C_1 \le 0 \]
will ensure that
\[ \frac{\partial}{\partial t} X_\infty(t, s; F) \Big|_{s=t+T} \ge 0 \]
for $t \ge 0$. Then, provided $\gamma$ is chosen sufficiently large, the resulting moving horizon controller
\[ u(t) = -\bigl( B_2' X_\infty(t, t+T; F) + D_{12}' C_1 \bigr) x(t) \]
will be a $\gamma$-feasible $H_\infty$ controller for the original problem. Further, Lemma 6.9 tells us exactly how much above the optimal $\gamma$ must be chosen.

An important point to note is that, if $\bar{X}$ is the solution to the algebraic Riccati equation
\[ (A - B_2 D_{12}' C_1)' \bar{X} + \bar{X} (A - B_2 D_{12}' C_1) - \bar{X} (B_2 B_2' - \gamma^{-2} B_1 B_1') \bar{X} + C_1' (I - D_{12} D_{12}') C_1 = 0 \]
then $F > \bar{X}$ is not sufficient for $X_\infty(t, t+T; F)$ to be stabilizing. Counterexamples to this, in the discrete time case for the regulator problem, are given in [13], and it is straightforward to construct similar examples in the continuous time case.

[Figure 6.1: Optimal gamma versus interval length for a finite horizon problem. Axes: optimal gamma versus interval length. Plant data printed with the figure (minus signs of the entries are not recoverable from this copy):
$A = \begin{bmatrix} 0.0100 & 0 & 30.0000 \\ 0 & 0.0100 & 15.0000 \\ 1.0000 & 1.0000 & 0 \end{bmatrix}$, $B_1 = \begin{bmatrix} 9.3969 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}$, $B_2 = \begin{bmatrix} 6.9500 \\ 0 \\ 0 \end{bmatrix}$, $C_1 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 16.3469 & 0 \end{bmatrix}$, $D_{12} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$.]

Figure 6.1 shows the variation in optimal achievable gamma by state feedback against horizon length for a linear time invariant system. The open loop poles of this system are $-0.0050 + 6.7082i$, $-0.0050 - 6.7082i$, and $-0.0100$, and the open loop frequency response is shown in Figure 6.2. The system is taken from a spring damper example in [29].
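The quoted open loop poles can be checked numerically. The following is a minimal sketch, assuming one particular sign pattern for the entries of $A$: the minus signs did not survive in the figure data, so the pattern below is an assumption, chosen because it reproduces the poles stated in the text. The variable names are illustrative only.

```python
import numpy as np

# ASSUMPTION: the signs of the entries of A are not recoverable from the
# figure data; this sign pattern is chosen so that the eigenvalues match
# the open loop poles quoted in the text.
A = np.array([
    [-0.01,  0.00, -30.0],
    [ 0.00, -0.01,  15.0],
    [ 1.00, -1.00,   0.0],
])

poles = np.linalg.eigvals(A)

# For the lightly damped complex pair, the resonant frequency is the
# magnitude of the imaginary part, and the period follows from it.
omega = np.max(np.abs(poles.imag))   # rad/s
period = 2.0 * np.pi / omega         # seconds

print("open loop poles:", np.sort_complex(poles))
print(f"resonant frequency = {omega:.4f} rad/s, period = {period:.3f} s")
```

Under this assumed sign pattern the characteristic polynomial factors as $(\lambda + 0.01)(\lambda^2 + 0.01\lambda + 45)$, which is where the lightly damped pair near $6.7$ rad/s comes from.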
The resonance shown has a period of about 0.93 seconds. However, we can see that, for the state feedback synthesis, the optimal achievable gamma is about constant for intervals greater than 0.4 seconds.

[Figure 6.2: Open loop frequency response. Axes: gain (dB) versus frequency (rad/s).]

The above analysis for LTI systems is not our main concern, however. It allows construction of an $H_\infty$ $\gamma$-feasible stabilizing controller, but there are already well known techniques for this. It gives two extra `tuning' parameters, but in practice these are of little value, since we have no idea how to tune them. Further, there are known design techniques using weights as tuning parameters for the $H_\infty$ problem, and the relationship between the weights and the resulting closed loop properties is far clearer than the relationship $F$ and $T$ have to the closed loop properties. Indeed, by increasing $F$ we may actually cause the closed loop to become unstable. It would appear therefore, that the main utility of these results is for time varying systems. Before investigating further interpretations, we will turn to the output feedback problem.

6.3 Measurement feedback

In this section, we will combine the output feedback results of Chapter 5 with the state feedback results of the previous section. In order to do this, we will derive results for the filtering Riccati equation which are analogous to those of the previous section for the control Riccati equation. Let $G$ be the finite dimensional linear time varying system with state space realization
\[ \begin{aligned} \dot{x}(t) &= A(t) x(t) + B_1(t) w(t) + B_2(t) u(t), \qquad x(0) = 0 \\ z(t) &= C_1(t) x(t) + D_{12}(t) u(t) \\ y(t) &= C_2(t) x(t) + D_{21}(t) w(t) \end{aligned} \tag{6.11} \]
defined on $\mathbb{R}_+$, with all matrices bounded. We will make the following assumptions about $G$.
6.A1) $D_{12}(t)' D_{12}(t) = I$ for all $t \in \mathbb{R}_+$
6.A2) $D_{21}(t) D_{21}(t)' = I$ for all $t \in \mathbb{R}_+$
6.A3) $(A, B_2)$ is stabilizable
6.A4) $(A, C_2)$ is detectable
6.A5) $(A - B_2 D_{12}' C_1,\ D_{12}^{\perp\prime} C_1)$ is detectable

The stabilizability and detectability assumptions are not necessary for the finite horizon problem, but will of course be necessary for the moving horizon problem in order to ensure stability. As before, the control Riccati differential equation is
\[ -\dot{X}_\infty = (A - B_2 D_{12}' C_1)' X_\infty + X_\infty (A - B_2 D_{12}' C_1) - X_\infty (B_2 B_2' - \gamma^{-2} B_1 B_1') X_\infty + C_1' (I - D_{12} D_{12}') C_1 \tag{6.12} \]
and we restate the definitions needed for the filtering Riccati equation:
\[ \begin{aligned} A_z &= A - B_1 D_{21}' C_2 + \gamma^{-2} B_1 D_{21}^{\perp\prime} D_{21}^{\perp} B_1' X \\ C_k &= -D_{12}' C_1 - B_2' X \\ C_z &= C_2 + \gamma^{-2} D_{21} B_1' X \end{aligned} \]
and the Riccati differential equation is
\[ \dot{Z}_\infty = A_z Z_\infty + Z_\infty A_z' - Z_\infty (C_z' C_z - \gamma^{-2} C_k' C_k) Z_\infty + B_1 D_{21}^{\perp\prime} D_{21}^{\perp} B_1'. \tag{6.13} \]
A solution to the output feedback finite horizon $H_\infty$ problem is given by the following theorem. In this case, $X = X_\infty$.

Theorem 6.11. For the linear time varying system $G$ defined by equations (6.11) on the finite horizon $[0, T]$, a controller $K$ exists giving $\|T_{zw}\| < \gamma$, where the norm is the finite horizon induced $L_2$ norm, if and only if there exist positive definite bounded solutions to the Riccati differential equations (6.12) and (6.13) on the interval $[0, T]$. In this case, one such controller is given by
\[ \dot{\hat{x}} = A_k \hat{x} + B_k y, \qquad \hat{x}(0) = 0, \qquad u = C_k \hat{x} \]
where
\[ \begin{aligned} A_k &= A + \gamma^{-2} B_1 B_1' X + B_2 C_k - (B_1 D_{21}' + Z_\infty C_z') C_z \\ B_k &= B_1 D_{21}' + Z_\infty C_z'. \end{aligned} \]

Proof. A variant on this theorem is proved in Chapter 3. See also Limebeer, Anderson, Khargonekar and Green [50].

We will write the solution $Z_\infty$ to the Riccati differential equation (6.13) on the interval $[t_1, t_2]$ with boundary condition $Z_\infty(t_1) = G$ as $Z_\infty(t_2, t_1; G)$, which therefore satisfies the matrix partial differential equation
\[ \frac{\partial}{\partial t} Z_\infty(t, s; G(s)) = A_z(t) Z_\infty(t, s; G(s)) + Z_\infty(t, s; G(s)) A_z(t)' - Z_\infty(t, s; G(s)) (C_z' C_z - \gamma^{-2} C_k' C_k)(t) Z_\infty(t, s; G(s)) + B_1 D_{21}^{\perp\prime} D_{21}^{\perp} B_1(t)' \tag{6.14} \]
for $t \ge s$, with boundary condition
\[ Z_\infty(t, t; G(t)) = G(t) \quad \text{for all } t \ge 0. \]
We now state the following Lemmas, directly analogous to Lemmas 6.1 and 6.2.

Lemma 6.12. Suppose $G$ is a bounded positive solution of
\[ -\dot{G} + A_z G + G A_z' - G (C_z' C_z - \gamma^{-2} C_k' C_k) G + B_1 D_{21}^{\perp\prime} D_{21}^{\perp} B_1' \le 0 \tag{6.15} \]
on the interval $[t_1, t_2]$, and $Z_\infty$ is a bounded positive solution of equation (6.13) on the interval $[t_1, t_2]$. Further suppose that $Z_\infty(t_1) = G(t_1)$. Then $Z_\infty(t) \le G(t)$ for all $t \in [t_1, t_2]$.

Proof. By substituting $\hat{t} = t_2 - t + t_1$, we arrive at an equation in the same form as equation (6.3). Hence we can apply Lemma 6.1 and the result follows.

Lemma 6.13. The solution $Z_\infty$ to the Riccati differential equation (6.14) satisfies, for all $t_3 \le t_2 \le t_1$ and for all $G \ge 0$,
a) $Z_\infty(t_1, t_2; Z_\infty(t_2, t_3; G)) = Z_\infty(t_1, t_3; G)$
b) $G > 0 \implies Z_\infty(t_1, t_2; G) > 0$
c) $B_1 D_{21}^{\perp\prime} D_{21}^{\perp} B_1(t)' > 0 \implies Z_\infty(t_1, t_2; G) > 0$
d) $G_1 \ge G_2 \ge 0 \implies Z_\infty(t_1, t_2; G_1) \ge Z_\infty(t_1, t_2; G_2)$

Proof. The proof of this follows exactly that of Lemma 6.2, with the only slight difficulty being part d). This can be transformed into the form of Lemma 6.2 part d) by defining $\tilde{Z} = -Z_\infty$.

We can now prove the filtering analogue of Lemma 6.3.

Lemma 6.14. Suppose $G$ satisfies the Riccati differential inequality
\[ -\dot{G} + A_z G + G A_z' - G (C_z' C_z - \gamma^{-2} C_k' C_k) G + B_1 D_{21}^{\perp\prime} D_{21}^{\perp} B_1' \le 0 \tag{6.16} \]
and $G(t) \ge 0$ for all $t \ge -T_2$. Suppose also that, for each $t \ge -T_2$, there exists a bounded nonnegative definite solution to the Riccati differential equation (6.14) on the interval $[t - T_2, t]$, with boundary condition $Z_\infty(t - T_2) = G(t - T_2)$. Then
\[ Z_\infty(t_1, t_2; G(t_2)) \ge Z_\infty(t_1, t_3; G(t_3)) \]
for all $-T_2 \le t_3 \le t_2 \le t_1$.

Proof. The proof follows exactly the lines of Lemma 6.3, with a few inequalities reversed. In particular, note that the order $-T_2 \le t_3 \le t_2 \le t_1$ is reversed from that used in the state feedback expressions. The details follow. By definition, $Z_\infty(t_3, t_3; G(t_3)) = G(t_3)$, and then applying Lemma 6.12 we have
\[ Z_\infty(t, t_3; G(t_3)) \le G(t) \]
for all $t \ge t_3$.
Thus, for $t_2 \ge t_3$,
\[ Z_\infty(t_2, t_3; G(t_3)) \le G(t_2). \]
From Lemma 6.13d), we know that $G_1 \ge G_2$ implies that $Z_\infty(t, t_2; G_1) \ge Z_\infty(t, t_2; G_2)$ for $t_2 \le t$. Hence
\[ Z_\infty(t_1, t_2; Z_\infty(t_2, t_3; G(t_3))) \le Z_\infty(t_1, t_2; G(t_2)) \]
and applying Lemma 6.13a), we have
\[ Z_\infty(t_1, t_3; G(t_3)) \le Z_\infty(t_1, t_2; G(t_2)). \]

Remark 6.15. Note that this implies that
\[ \frac{\partial}{\partial s} Z_\infty(t, s; G(s)) \ge 0 \]
for $t \ge s$.

The following Lemma will allow us to construct output feedback moving horizon $H_\infty$ controllers.

Lemma 6.16. Suppose $G$ satisfies (6.16), and let $Z(t) = Z_\infty(t, t - T_2; G(t - T_2))$. Then $Z$ satisfies the Riccati differential inequality
\[ \dot{Z} \ge A_z Z + Z A_z' - Z (C_z' C_z - \gamma^{-2} C_k' C_k) Z + B_1 D_{21}^{\perp\prime} D_{21}^{\perp} B_1'. \tag{6.17} \]

Proof. Differentiating $Z$ gives
\[ \dot{Z} = A_z Z + Z A_z' - Z (C_z' C_z - \gamma^{-2} C_k' C_k) Z + B_1 D_{21}^{\perp\prime} D_{21}^{\perp} B_1' + \frac{\partial}{\partial s} Z_\infty(t, s; G(s)) \Big|_{s=t-T_2}. \]
Applying Lemma 6.14 gives the desired result.

Note that in the coefficients $A_k$, $C_k$, and $C_z$, it does not matter whether $X$ is defined by the moving horizon control Riccati equation or by an infinite horizon solution. Also we can easily ensure that the inequalities are satisfied uniformly, since by adding $\varepsilon I$ to the term $B_1 D_{21}^{\perp\prime} D_{21}^{\perp} B_1'$, it is clear that if $G$ satisfies the Riccati differential inequality uniformly, and $Z_\infty(t)$ is defined on the finite horizon by
\[ \dot{Z}_\infty = A_z Z_\infty + Z_\infty A_z' - Z_\infty (C_z' C_z - \gamma^{-2} C_k' C_k) Z_\infty + B_1 D_{21}^{\perp\prime} D_{21}^{\perp} B_1' + \varepsilon I \]
then $Z_\infty(t, t - T_2; G(t - T_2))$ will satisfy the Riccati differential inequality (6.17) uniformly also.

The output feedback controller. We now construct the output feedback controller. The interpretation here is that we would like to solve the differential game problem
\[ \inf_{\mu \in \mathcal{U}_{of}[t, t+T]} \ \sup_{w \in L_2[t, t+T]} \int_t^{t+T} \bigl( z(s)' z(s) - \gamma^2 w(s)' w(s) \bigr) ds + x(t+T)' F(t+T) x(t+T). \]
However, we no longer can solve this problem conditional on the state at time $t$, since we do not have this information. In the moving horizon problem for the quadratic regulator, this problem is usually solved by assuming a Kalman filter provides the state estimate.
However, we would like to guarantee an infinite horizon $H_\infty$ norm bound, and so the assumption of $w$ being white noise does not fit well with this objective. Therefore we now solve, at time $t$, the new differential game
\[ \inf_{\mu \in \mathcal{U}_{of}[t - T_2, t+T]} \ \sup_{\substack{w \in L_2[t - T_2, t+T] \\ x(t - T_2) \in \mathbb{R}^n}} \Bigl\{ -\gamma^2 x(t - T_2)' H(t - T_2) x(t - T_2) + \int_{t - T_2}^{t+T} \bigl( z(s)' z(s) - \gamma^2 w(s)' w(s) \bigr) ds + x(t+T)' F(t+T) x(t+T) \Bigr\}. \]
The matrix $H(t - T_2) > 0$ now represents our assumed uncertainty in the state at time $t - T_2$. By applying Lemma 6.16 and Theorem 5.17, we can now state the output feedback moving horizon result in full.

Theorem 6.17. Let the system $G$ be defined by equations (6.11) satisfying assumptions (6.A1) to (6.A5). Suppose $F \in L_\infty^{n \times n}$ satisfies
\[ \dot{F} + (A - B_2 D_{12}' C_1)' F + F (A - B_2 D_{12}' C_1) - F (B_2 B_2' - \gamma^{-2} B_1 B_1') F + C_1' (I - D_{12} D_{12}') C_1 \le 0 \]
and $G \in L_\infty^{n \times n}$ satisfies
\[ -\dot{G} + A_z G + G A_z' - G (C_z' C_z - \gamma^{-2} C_k' C_k) G + B_1 D_{21}^{\perp\prime} D_{21}^{\perp} B_1' \le -\varepsilon I \tag{6.18} \]
with $F \ge 0$ and $G \ge 0$. Suppose also that for each $t \ge 0$, there exists a bounded solution to the Riccati differential equation (6.3) on the interval $[t, t+T]$ with boundary condition $X_\infty(t+T) = F(t+T)$, and define $X(t) = X_\infty(t, t+T; F(t+T))$ by the PDE (6.4). Suppose further that, for some constant $\varepsilon > 0$, for each $t \ge 0$, there exists a bounded solution to the Riccati differential equation
\[ \dot{Z}_\infty = A_z Z_\infty + Z_\infty A_z' - Z_\infty (C_z' C_z - \gamma^{-2} C_k' C_k) Z_\infty + B_1 D_{21}^{\perp\prime} D_{21}^{\perp} B_1' + \varepsilon I \]
on the interval $[t - T_2, t]$, with boundary condition $Z_\infty(t - T_2) = G(t - T_2)$, and define $Z(t) = Z_\infty(t, t - T_2; G(t - T_2))$ as that solution. Then the controller
\[ \dot{\hat{x}} = A_k \hat{x} + B_k y, \qquad \hat{x}(0) = 0, \qquad u = C_k \hat{x} \]
exponentially stabilizes the closed loop, and results in a closed loop induced norm bound $\|T_{zw}\| \le \gamma$. In terms of the moving horizon Riccati differential equations, the parameters are
\[ \begin{aligned} A_z &= A - B_1 D_{21}' C_2 + \gamma^{-2} B_1 D_{21}^{\perp\prime} D_{21}^{\perp} B_1' X \\ C_k &= -D_{12}' C_1 - B_2' X \\ C_z &= C_2 + \gamma^{-2} D_{21} B_1' X \\ A_k &= A + \gamma^{-2} B_1 B_1' X + B_2 C_k - (B_1 D_{21}' + Z C_z') C_z \\ B_k &= B_1 D_{21}' + Z C_z'. \end{aligned} \]

Remark 6.18.
Note that in the above solution, we require knowledge of the system matrices $A$, $B$, $C$, $D$ from before the time the controller is implemented, on the interval $[-T_2, 0]$. This is a somewhat artificial restriction of the method. In fact, the matrix $H(t - T_2)$ is given by
\[ H(t - T_2) = G(t - T_2)^{-1} + \gamma^{-2} X_\infty(t - T_2, t + T_1; F(t + T_1)) \]
using equations (3.30) and (3.24). This implicit choice of initial uncertainty of the state ensures that the usual coupling condition
\[ X_\infty(t) - \gamma^2 Y_\infty^{-1}(t) < 0 \]
is satisfied by any positive choice of $G$ satisfying (6.18).

This theorem shows that the moving horizon technique gives a stabilizing $\gamma$-feasible controller for a linear time varying system. In some sense, however, the proof illustrates an apparent flaw in this methodology. If we can construct $F$ and $G$, why not simply use them as $X$ and $Z$ in the controller formulae? In fact, why not simply construct $F$ and $G$ by integrating forwards in time the Riccati equations? One reason is that the finite horizon Riccati differential equations converge to the solutions of the infinite horizon Riccati differential equations, so we do not expect the solution to be sensitive to the choice of $F$ and $G$. We shall return to consideration of these points in the sequel. In the meantime, we would expect to gain something by using recent past and future information about the system to construct the matrices $F$ and $G$.

Heuristically, for the moving horizon controller, we expect a large choice of $F$ to give stability, since this then corresponds to a large terminal state weighting matrix. Indeed, our intuition from the finite horizon state feedback problem is as follows. For the purpose of this discussion, consider a linear time invariant system, and the construction of a state feedback controller with a fixed terminal weight on a finite horizon $[0, T]$. Then we know that we can choose any positive semi-definite terminal state weight, $Q_f$, and construct a finite horizon $\gamma$-feasible $H_\infty$ controller.
However, we also know that we are not restricted to choosing solutions to the equation, but we can also choose solutions to the inequality. Further, since we are integrating backwards, given any specific final value $Q_f$, any trajectory of $X$ which increases sufficiently fast backwards in time will be a solution of the inequality, and hence will provide a $\gamma$-feasible $H_\infty$ controller.

Figure 6.3 shows typical solutions for the control Riccati equation for a 3rd order system. The system here is taken from [29, p. 299]. The finite horizon is $[0, 1]$, and the final conditions are all multiples of the infinite horizon stabilizing solution. In this particular example, it is easy to see that, since the stabilizing solution to the algebraic Riccati equation is positive definite, and in this case the term $B_2 B_2' - \gamma^{-2} B_1 B_1'$ is sign definite, applying Lemma 6.10 implies that all solutions with a scalar multiple of that solution as a final boundary condition will be monotonic. The graph shows only the maximum singular value of the solutions.

[Figure 6.3: Solutions to the Riccati differential equation on $[0, 1]$. Axes: $\bar{\sigma}(X_\infty)$ versus time. Plant data printed with the figure (minus signs of the entries are not recoverable from this copy):
$A = \begin{bmatrix} 0.0100 & 0 & 30.0000 \\ 0 & 0.0100 & 15.0000 \\ 1.0000 & 1.0000 & 0 \end{bmatrix}$, $B_1 = \begin{bmatrix} 1.2659 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}$, $B_2 = \begin{bmatrix} 31.6481 \\ 0 \\ 0 \end{bmatrix}$, $C_1 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1.3166 & 0 \end{bmatrix}$, $D_{12} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$, and $X_\infty = \begin{bmatrix} 0.0140 & 0.0283 & 0.0948 \\ 0.0283 & 0.3142 & 0.4776 \\ 0.0948 & 0.4776 & 1.2759 \end{bmatrix}$ in steady state.]

Figure 6.4 illustrates possible solutions to the Riccati differential inequality on $[0, 1]$. In this figure, at times $0, 0.1, \ldots, 1$, the Riccati differential equation has been perturbed by adding $0.05 I$ to $-\dot{X}_\infty$. Integrating backwards, we can always add to the solution $X_\infty$, but we cannot subtract from it. A game theoretic interpretation of this statement will also be given in the next chapter. Figure 6.5 illustrates a solution to the Riccati differential inequality, which satisfies the equation for $t \notin [0.75, 0.8]$. Within this interval, it is perturbed by $0.05 I$.
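The monotonicity claim for terminal weights that are scalar multiples of the stabilizing solution can be illustrated numerically. The sketch below is not the thesis example: it assumes a hypothetical stable 2-state plant with $R = BB' \ge 0$ (so the quadratic term is sign definite) and $Q = I$, and the helper names `riccati_backward` and `loewner_steps` are introduced here for illustration. For $Q_f = c\bar{X}$ the algebraic Riccati equation gives $-\dot{X}(t_f) = (1 - c)(c\bar{X}R\bar{X} + Q)$, so Lemma 6.10 predicts a non-increasing solution for $c < 1$ and a non-decreasing one for $c > 1$.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import solve_continuous_are


def riccati_backward(A, R, Q, Qf, horizon, n_samples=21):
    """Integrate -dX/dt = A'X + XA - XRX + Q backwards from X(tf) = Qf.

    With tau = tf - t the equation becomes dX/dtau = A'X + XA - XRX + Q,
    which is solved forwards in tau. Returns (taus, Xs), Xs[k] = X(tf - taus[k]).
    """
    n = A.shape[0]

    def rhs(_tau, x_flat):
        X = x_flat.reshape(n, n)
        return (A.T @ X + X @ A - X @ R @ X + Q).ravel()

    taus = np.linspace(0.0, horizon, n_samples)
    sol = solve_ivp(rhs, (0.0, horizon), Qf.ravel(), t_eval=taus,
                    rtol=1e-10, atol=1e-12)
    return taus, sol.y.T.reshape(-1, n, n)


def loewner_steps(Xs):
    """Smallest and largest eigenvalue over all successive differences X[k+1] - X[k]."""
    eigs = []
    for prev, nxt in zip(Xs[:-1], Xs[1:]):
        diff = nxt - prev
        eigs.append(np.linalg.eigvalsh(0.5 * (diff + diff.T)))
    return min(e[0] for e in eigs), max(e[-1] for e in eigs)


# Hypothetical data (an assumption, not taken from the thesis).
A = np.array([[0.0, 1.0], [-2.0, -0.5]])
B = np.array([[0.0], [1.0]])
R = B @ B.T
Q = np.eye(2)
X_bar = solve_continuous_are(A, B, Q, np.eye(1))  # stabilizing ARE solution

results = {}
for c in (0.5, 2.0):
    _, Xs = riccati_backward(A, R, Q, c * X_bar, horizon=1.0)
    results[c] = loewner_steps(Xs)
    lo, hi = results[c]
    # Samples are ordered by tau = tf - t: growth in tau means decay in t.
    print(f"c = {c}: eigenvalues of successive differences lie in [{lo:.2e}, {hi:.2e}]")
```

Because the samples are ordered in reversed time, a positive definite difference for $c = 0.5$ corresponds to a solution that is non-increasing in $t$, and a negative definite difference for $c = 2$ to one that is non-decreasing in $t$.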
Note that, after the perturbation, the solution rapidly converges back to the steady state solution.

For linear time invariant systems, Theorem 6.17 implies that we can choose any of these boundary conditions $Q_f$ which result in monotonic solutions for the Riccati equation as a constant choice for $X_\infty$, with a corresponding result for $Z_\infty$. The resulting controller will be $\gamma$-feasible and stabilizing. Lemma 6.10 gives a computable condition for LTI systems which allows us to find monotonic solutions. That is, in order to find monotonic solutions to the Riccati differential equation for $X_\infty$, we need to find $Q_f$ satisfying
\[ (A - B_2 D_{12}' C_1)' Q_f + Q_f (A - B_2 D_{12}' C_1) - Q_f (B_2 B_2' - \gamma^{-2} B_1 B_1') Q_f + C_1' (I - D_{12} D_{12}') C_1 < 0. \]

[Figure 6.4: Some solutions to the Riccati differential inequality on $[0, 1]$. Axes: $\bar{\sigma}(X_\infty)$ versus time.]

The importance of this inequality is well known in the LMI approach to $H_\infty$ control, and the above inequality can be written
\[ \begin{bmatrix} N_R & 0 \\ 0 & I \end{bmatrix}' \begin{bmatrix} A Q_f^{-1} + Q_f^{-1} A' & Q_f^{-1} C_1' & B_1 \\ C_1 Q_f^{-1} & -I & 0 \\ B_1' & 0 & -\gamma^2 I \end{bmatrix} \begin{bmatrix} N_R & 0 \\ 0 & I \end{bmatrix} < 0 \]
where $N_R$ denotes a basis for the null space of $\begin{bmatrix} B_2' & D_{12}' \end{bmatrix}$. See [26] for more details on this approach. Here we have given a different intuition to the LMI solution for linear time invariant problems, and extended this approach to cover linear time varying systems also.

6.4 Terminal constraints

We therefore might expect that one possible method for constructing $F$ would be to construct the solution with an infinite terminal state weight. This corresponds to the constraint that, for the finite horizon dynamic game on $[t, t+T]$, the system state is brought to zero at time $t+T$. We can achieve this in the minimax problem by putting an infinite weight on the state at time $t+T$, that is by choosing $Q_f = \infty$. The following Lemma considers the Riccati equation for $X_\infty^{-1}$, to avoid the singularity at the endpoint.

Lemma 6.19.
Suppose for $i = 1, 2$ that $R_i$ is a bounded solution to the Riccati differential equation
\[ \dot{R}_i = (A - B_2 D_{12}' C_1) R_i + R_i (A - B_2 D_{12}' C_1)' + R_i C_1' D_{12}^{\perp} D_{12}^{\perp\prime} C_1 R_i - (B_2 B_2' - \gamma^{-2} B_1 B_1') \]
on the interval $[t_1, t_2]$, with $R_1(t_2) = Q_1$ and $R_2(t_2) = Q_2$, and $Q_1 \ge Q_2 \ge 0$. Then $R_1(t) \ge R_2(t)$ for all $t \in [t_1, t_2]$.

Proof. Note that although $R = X_\infty^{-1}$, we cannot simply apply Lemma 6.2, since we specifically want to apply this to the case $Q_2 \not> 0$. In order to do this, we use the results of Brockett [16, p. 130] on linear quadratic regulation with an indefinite cost on the state. If $R_i$ is a bounded solution to the Riccati differential equation then
\[ x(t)' R_i(t) x(t) = \min_{u \in L_2[t, t_2]} \int_t^{t_2} \bigl( x(s)' (B_2 B_2' - \gamma^{-2} B_1 B_1')(s) x(s) + u(s)' u(s) \bigr) ds + x(t_2)' Q_i x(t_2) \]
where $x$ is subject to the dynamics
\[ \dot{x}(t) = -(A - B_2 D_{12}' C_1)'(t) x(t) + C_1' D_{12}^{\perp} u. \]
Clearly, since this is a minimization problem, and the cost function is greater for $i = 1$ than for $i = 2$, we have $R_1(t) \ge R_2(t)$.

[Figure 6.5: A solution to the Riccati differential inequality. Axes: $\bar{\sigma}(X_\infty)$ versus time.]

Lemma 6.20. Let $0 \le t_1 < t_2 \le t_3$. Suppose that, for $s = t_2$ and $s = t_3$, there exists a bounded solution $R \in L_\infty^{n \times n}$ to the Riccati differential equation
\[ \dot{R} = (A - B_2 D_{12}' C_1) R + R (A - B_2 D_{12}' C_1)' + R C_1' D_{12}^{\perp} D_{12}^{\perp\prime} C_1 R - (B_2 B_2' - \gamma^{-2} B_1 B_1') \tag{6.19} \]
on the interval $[t_1, s]$ such that $R(s) = 0$ and $R(t) > 0$ for $t_1 \le t < s$. Let $R(t, s)$ be the solution at time $t$ with boundary condition at time $s$, defined on the interval $[t_1, s]$. Then
\[ R(t, t_3)^{-1} \le R(t, t_2)^{-1} \]
for all $t \in [t_1, t_2)$. Further, if for all $t \ge 0$, there exists a solution $R(t, t+T)$ satisfying the above conditions, then $X(t) = R(t, t+T)^{-1}$ satisfies the control Riccati inequality (6.12).

Proof. By assumption, $R(t_2, t_3) > 0$, and $R(t_2, t_2) = 0$. Therefore, in the same way as the proof of Lemma 6.3, we can use time $t_3$ as a boundary condition. Then, Lemma 6.19 implies that $R(t, t_2) \le R(t, t_3)$ for all $t_1 \le t \le t_2$.
The last part of the Lemma then follows by direct differentiation.

Lemma 6.21. Let $G$ be an exponentially stable linear time varying system defined on $[0, \infty)$ by
\[ \begin{aligned} \dot{x}(t) &= A(t) x(t) + B(t) w(t) \\ z(t) &= C(t) x(t) + D(t) w(t) \end{aligned} \]
and define $\|G\|_{[t,s]}$ as the finite horizon induced norm of the system on the interval $[t, s]$ with state $x(t) = 0$. Then there exists $\gamma > 0$ such that $\|G\|_{[t,s]} < \gamma$ for all $0 \le t < s$.

Proof. Since $G$ is exponentially stable, $\|G\|$ is bounded. Clearly $\|G\| \ge \|G\|_{[t,s]}$ for all $0 \le t < s$, since the worst case $w$ for the finite horizon problem can be used to generate the same output for the infinite horizon problem.

The above Lemma does not imply that the induced norm is bounded for each finite horizon problem with a terminal constraint. We will use the following sequence of Lemmas to achieve this.

Lemma 6.22. For given bounded matrix functions $A$, $B$, $C$, $D$, there exists $\gamma > 0$ such that there exists a bounded solution to the Riccati differential equation (6.19) on the finite interval $[t_1, t_2]$ with boundary condition $R(t_2) = 0$.

Proof. This is proved in [7, Proposition 8.5].

Lemma 6.23. If there exists a solution to equation (6.19) on the finite interval $[t_1, t_2]$ with boundary condition $R(t_2) = 0$, then a solution exists for all $A$, $B$, $C$, $D$, and $\gamma$ in an open neighbourhood of the nominal, and the solution $R$ is uniformly continuous in the state space matrices $A$, $B$, $C$, $D$ and in $\gamma$.

Proof. A proof of continuity for general nonlinear equations satisfying a local Lipschitz condition is in Sontag [61]. For specific details for finite horizon Riccati equations, see also Başar and Bernhard [7, Chapter 8].

Lemma 6.24. Consider the set of finite dimensional linear time varying systems $G$ described by
\[ \begin{aligned} \dot{x}(t) &= A(t) x(t) + B(t) u(t) \\ y(t) &= C(t) x(t) + D(t) u(t) \end{aligned} \]
where $A \in L_\infty^{n \times n}$, $B \in L_\infty^{n \times m}$, $C \in L_\infty^{p \times n}$ and $D \in L_\infty^{p \times m}$ on the interval $[0, \tau]$ or $[0, \infty)$. Then, if $\tau$ is finite, the induced norm of $G$ on the finite interval $[0, \tau]$ is a continuous function of the state space matrices with the uniform metric.
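On $[0, \infty)$, if $A$ is exponentially stable, then there exists a neighbourhood of the state space matrices in which every system is exponentially stable, and further $\|G\|$ is continuous there.

Proof. Let
\[ G_1 = \begin{bmatrix} A_1 & B_1 \\ C_1 & D_1 \end{bmatrix}, \qquad G_2 = \begin{bmatrix} A_2 & B_2 \\ C_2 & D_2 \end{bmatrix} \]
with states $x_1$ and $x_2$ respectively. Define $p = x_1 - x_2$ and $q = x_1 + x_2$, and $A_\Delta = A_2 - A_1$, $B_\Delta = B_2 - B_1$, $C_\Delta = C_2 - C_1$, and $D_\Delta = D_2 - D_1$. We apply the usual $\varepsilon$, $\delta$ arguments for continuity, and show that, given any $\varepsilon > 0$, there exists a $\delta > 0$ such that
\[ \max(\|A_\Delta\|_\infty, \|B_\Delta\|_\infty, \|C_\Delta\|_\infty, \|D_\Delta\|_\infty) < \delta \]
implies that $\|G_1 - G_2\| \le \varepsilon$. We can rewrite the system equation $z = (G_1 - G_2) w$ as
\[ \begin{bmatrix} \dot{p} \\ \dot{q} \end{bmatrix} = \hat{A} \begin{bmatrix} p \\ q \end{bmatrix} + \hat{B} w, \qquad z = \hat{C} \begin{bmatrix} p \\ q \end{bmatrix} + \hat{D} w \]
where
\[ \hat{A} = \begin{bmatrix} A_1 + \tfrac{1}{2} A_\Delta & -\tfrac{1}{2} A_\Delta \\ -\tfrac{1}{2} A_\Delta & A_1 + \tfrac{1}{2} A_\Delta \end{bmatrix}, \quad \hat{B} = \begin{bmatrix} -B_\Delta \\ 2 B_1 + B_\Delta \end{bmatrix}, \quad \hat{C} = \begin{bmatrix} C_1 + \tfrac{1}{2} C_\Delta & -\tfrac{1}{2} C_\Delta \end{bmatrix}, \quad \hat{D} = -D_\Delta. \]
Then, from Lemma 5.9, if there exists a bounded matrix $X$ such that
\[ \dot{X} + \hat{A}' X + X \hat{A} + \hat{C}' \hat{C} + (X \hat{B} + \hat{C}' \hat{D}) (\gamma^2 I - \hat{D}' \hat{D})^{-1} (\hat{B}' X + \hat{D}' \hat{C}) \le -\varepsilon_1 I \]
for some constant $\varepsilon_1 > 0$, then $\|G_1 - G_2\| < \gamma$. Let
\[ X = \begin{bmatrix} X_1 & 0 \\ 0 & X_2 \end{bmatrix}; \]
then if $A_\Delta = 0$, $B_\Delta = 0$, $C_\Delta = 0$, and $D_\Delta = 0$, we arrive at the equivalent Riccati equations
\[ \begin{aligned} -\varepsilon_1 I &\ge A_1' X_1 + X_1 A_1 + \dot{X}_1 + C_1' C_1 \\ -\varepsilon_1 I &\ge A_1' X_2 + X_2 A_1 + \dot{X}_2 + 4 \gamma^{-2} X_2 B_1 B_1' X_2. \end{aligned} \]
In the finite horizon case, uniformly bounded solutions to these equations always exist, since they correspond to equations for the controllability and inverse of the observability gramians. In the infinite horizon case, solutions will always exist if $A_1$ is exponentially stable. Since the set of uniformly negative definite matrices is open, Lemma 6.23 implies that choosing $A_\Delta$, $B_\Delta$, $C_\Delta$ and $D_\Delta$ sufficiently small in the uniform norm will ensure that the above $2n \times 2n$ Riccati equation still has a uniformly bounded solution, and that $\gamma^2 I - \hat{D}' \hat{D}$ is uniformly invertible, from which the result follows.

Corollary 6.25. Let $\mathcal{S}$ be a set of matrix functions $A$, $B$, $C$ and $D$ defined on $[t_0, t_1]$. Suppose there exists $\beta > 0$ such that
\[ \max(\|A(t)\|, \|B(t)\|, \|C(t)\|, \|D(t)\|) < \beta \]
for all $t \ge 0$ and for all $(A, B, C, D) \in \mathcal{S}$. Then there exists $\gamma > 0$ such that, for any $(A, B, C, D) \in \mathcal{S}$, the induced norm of the corresponding linear time varying system on $[t_0, t_1]$ is less than $\gamma$.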
Lemma 6.26. Define the controllability gramians
\[ G_i(t_0, t_1) := \int_{t_0}^{t_1} \Phi(t_0, r) B_i(r) B_i(r)' \Phi(t_0, r)' \, dr \]
for $i = 1, 2$, where $\Phi$ is the transition matrix for $A$. Suppose that there exists $\beta > 0$ such that, for all $t \ge 0$,
\[ \beta^{-1} I < G_2(t, t+T) < \beta I, \qquad G_1(t, t+T) < \beta I. \]
Then there exists $\gamma > 0$ such that, for each $t \ge 0$, the Riccati differential equation (6.19) has a bounded solution on each interval $[t, t+T]$ with boundary condition $R(t+T, t+T) = 0$. Further, there exists $\kappa > 0$ such that $R(t, t+T) < \kappa I$ for all $t \ge 0$.

Proof. Define the operators $\Psi_i : L_2[t, t+T] \to \mathbb{R}^n$ by
\[ \Psi_i u := \int_t^{t+T} \Phi(t, s) B_i(s) u(s) \, ds \]
for $i = 1, 2$. Then $G_i(t, t+T) = \Psi_i \Psi_i^*$. Then, for given inputs $u_{[t,t+T]}$, $w_{[t,t+T]}$, the state $x(t+T)$ is given by
\[ x(t+T) = \Psi_1 w_{[t,t+T]} + \Psi_2 u_{[t,t+T]}. \]
Therefore, the condition that the state $x(t+T)$ be zero is equivalent to the restriction that $u_{[t,t+T]}$ satisfy
\[ u_{[t,t+T]} = -\Psi_2^* (\Psi_2 \Psi_2^*)^{-1} \Psi_1 w_{[t,t+T]} + v = L_t w_{[t,t+T]} + v \]
for some $v$ in the kernel of $\Psi_2$. From the assumptions in the Lemma, it is easy to see that $\|L_t\| < \beta^2$ for all $t \ge 0$. At each $t$, we can write
\[ z_{[t,t+T]} = \mathcal{F}_t w_{[t,t+T]} + \mathcal{G}_t u_{[t,t+T]} = (\mathcal{F}_t + \mathcal{G}_t L_t) w_{[t,t+T]} + \mathcal{G}_t v. \]
The map $\mathcal{F}_t$ is defined by the state space matrices $A$, $B_1$, $C_1$, and $D_{11}$, which by assumption are bounded over time, and hence there exists $\gamma_1 > 0$ such that $\|\mathcal{F}_t\| < \gamma_1$ for all $t \ge 0$. Similarly, there exists $\gamma_2 > 0$ such that $\|\mathcal{G}_t\| < \gamma_2$ for all $t \ge 0$. Therefore, for each $t > 0$, the controller can always achieve a finite horizon induced norm on the interval $[t, t+T]$ less than $\gamma_1 + \gamma_2 \beta^2$, by choosing $v = 0$, and satisfy the terminal constraint. Thus, if $\gamma > \gamma_1 + \gamma_2 \beta^2$, there exists a bounded solution to each Riccati differential equation (5.1) on every interval of the form $[t, t+T]$.

Finally, we have to show that the solution $R(t, t+T)$ is bounded over all $t$. This follows since by Lemma 6.23 the solution $R(t, t+T)$ is continuous in the state space matrices of the system. Since these are bounded, $R(t, t+T)$ is bounded also.

Remark 6.27.
A dual theorem statement holds for the filtering Riccati equation with an infinite initial condition. However, the gramians in this case depend on $X_\infty$, so this does not give rise to a testable condition. We could of course rewrite the problem in terms of the independent $X_\infty$ and $Y_\infty$ equations, with a coupling condition. In this case, we could use this Lemma to guarantee boundedness of both Riccati equations, but not to guarantee that the coupling condition would not fail.

We can now state the following theorem, which has a terminal constraint for the state feedback problem.

Theorem 6.28. Let the system $G$ be defined by equations (6.11) satisfying assumptions (6.A1) to (6.A5). Suppose also that $G_2(t, t+T)$ and $G_1(t, t+T)$ satisfy the conditions of Lemma 6.26. Then, let $\gamma > 0$ be such that there exists a bounded solution to the Riccati differential equation
\[ \dot{R} = (A - B_2 D_{12}' C_1) R + R (A - B_2 D_{12}' C_1)' + R C_1' D_{12}^{\perp} D_{12}^{\perp\prime} C_1 R - (B_2 B_2' - \gamma^{-2} B_1 B_1') \]
on the interval $[t, t+T]$ with boundary condition $R(t+T, t+T) = 0$, and with $R(t, t+T)$ bounded over all $t \ge 0$. Then the state feedback controller
\[ u = -\bigl( B_2(t)' R(t, t+T)^{-1} + D_{12}(t)' C_1(t) \bigr) x(t) \]
exponentially stabilizes the closed loop, and results in a closed loop induced norm which satisfies $\|T_{zw}\| \le \gamma$.

6.5 Recursive computation of $X_\infty$

In order to implement the moving horizon controller, we need to calculate the value of $X_\infty(t, t+T; F(t+T))$ for all $t > 0$. This requires the solution of a Riccati differential equation over the interval $[t, t+T]$ for each time $t$, given boundary conditions $X_\infty(t+T, t+T; F(t+T)) = F(t+T)$ for all $t > 0$. Kwon, Bruckstein and Kailath [45] applied results from scattering theory in their solution to the quadratic problem, to give a forwards differential equation for $X_\infty(t, t+T; F(t+T))$. Here we state the same solution for the indefinite Riccati equation. These results are derived in [73] and [59] for the case when the Riccati differential equation has a positive definite quadratic term.
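Before turning to the recursive scheme, the direct approach described above (re-solving the Riccati differential equation on $[t, t+T]$ at each time $t$) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: the function name `moving_horizon_X` and the callables `A_bar`, `N`, `Q`, `F` are hypothetical, with $\bar{A} = A - B_2 D_{12}' C_1$, $N = B_2 B_2' - \gamma^{-2} B_1 B_1'$ and $Q = C_1'(I - D_{12} D_{12}') C_1$. The scalar data at the end are invented for a sanity check only.

```python
import numpy as np
from scipy.integrate import solve_ivp


def moving_horizon_X(t, T, A_bar, N, Q, F):
    """Return X(t, t+T; F(t+T)) for the Riccati differential equation
    -dX/ds = A_bar(s)'X + X A_bar(s) - X N(s) X + Q(s),  X(t+T) = F(t+T).

    A_bar, N, Q and F are callables returning matrices at a given time.
    The equation is integrated in reversed time tau = t + T - s.
    """
    Xf = np.atleast_2d(F(t + T))
    n = Xf.shape[0]

    def rhs(tau, x_flat):
        s = t + T - tau                      # physical time for this step
        X = x_flat.reshape(n, n)
        Ab = A_bar(s)
        return (Ab.T @ X + X @ Ab - X @ N(s) @ X + Q(s)).ravel()

    sol = solve_ivp(rhs, (0.0, T), Xf.ravel(), rtol=1e-10, atol=1e-12)
    return sol.y[:, -1].reshape(n, n)


# Hypothetical scalar LTI data (an assumption, used only as a sanity check).
a, b1, b2, q, gamma = 1.0, 0.5, 1.0, 1.0, 2.0
n_val = b2**2 - (b1 / gamma) ** 2
# Stabilizing root of the scalar algebraic Riccati equation 2 a x - n x^2 + q = 0.
x_bar = (a + np.sqrt(a**2 + n_val * q)) / n_val


def const(value):
    """Wrap a scalar as a constant 1x1 matrix-valued function of time."""
    return lambda s: np.array([[value]])


X_now = moving_horizon_X(t=0.3, T=0.5, A_bar=const(a), N=const(n_val),
                         Q=const(q), F=const(x_bar))
gain = b2 * X_now[0, 0]   # u = -gain * x, assuming D12'C1 = 0 in this toy case
print(X_now, x_bar, gain)
```

With the terminal weight set to the stabilizing algebraic solution, the moving horizon solution should coincide with that solution at every $t$, which gives a simple check on the integration. The cost of this direct approach is one backward integration per control update, which is what the forward recursion below avoids.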
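In this section, $P(\tau, \sigma)$ is used to mean $X_\infty(\tau, \sigma; 0)$. Define for convenience
\[ N(\tau) = B_2(\tau) B_2(\tau)' - \gamma^{-2} B_1(\tau) B_1(\tau)' \]
and let
\[ S(\tau, \sigma) = \begin{bmatrix} \Phi(\tau, \sigma) & L(\tau, \sigma) \\ P(\tau, \sigma) & \Psi(\tau, \sigma) \end{bmatrix}. \]
Consider the system of equations
\[ \frac{\partial}{\partial \tau} \Phi(\tau, \sigma) = \Phi(\tau, \sigma) [N(\tau) P(\tau, \sigma) - A(\tau)] \tag{6.20} \]
\[ \frac{\partial}{\partial \tau} \Psi(\tau, \sigma) = [P(\tau, \sigma) N(\tau) - A'(\tau)] \Psi(\tau, \sigma) \tag{6.21} \]
\[ \frac{\partial}{\partial \tau} L(\tau, \sigma) = \Phi(\tau, \sigma) N(\tau) \Psi(\tau, \sigma) \tag{6.22} \]
\[ -\frac{\partial}{\partial \tau} P(\tau, \sigma) = A'(\tau) P(\tau, \sigma) + P(\tau, \sigma) A(\tau) - P(\tau, \sigma) N(\tau) P(\tau, \sigma) + C_1(\tau)' C_1(\tau) \tag{6.23} \]
with boundary conditions
\[ S(\sigma, \sigma) = \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix}. \]
The following equations
\[ \begin{aligned} \frac{\partial}{\partial \sigma} \Phi(\tau, \sigma) &= [A(\sigma) + L(\tau, \sigma) C_1(\sigma)' C_1(\sigma)] \Phi(\tau, \sigma) \\ \frac{\partial}{\partial \sigma} \Psi(\tau, \sigma) &= \Psi(\tau, \sigma) [A'(\sigma) + C_1(\sigma)' C_1(\sigma) L(\tau, \sigma)] \\ \frac{\partial}{\partial \sigma} L(\tau, \sigma) &= A(\sigma) L(\tau, \sigma) + L(\tau, \sigma) A'(\sigma) + L(\tau, \sigma) C_1(\sigma)' C_1(\sigma) L(\tau, \sigma) - N(\sigma) \\ \frac{\partial}{\partial \sigma} P(\tau, \sigma) &= \Psi(\tau, \sigma) C_1(\sigma)' C_1(\sigma) \Phi(\tau, \sigma) \end{aligned} \]
give the partial derivatives with respect to $\sigma$. It is straightforward to verify this by taking both sets of second partial derivatives and showing that they are equal, and verifying that along the boundary the total derivative
\[ \frac{d}{dt} S(t, t) = \frac{\partial S(\tau, t)}{\partial \tau} \Big|_{\tau=t} + \frac{\partial S(t, \sigma)}{\partial \sigma} \Big|_{\sigma=t} \]
is zero, since $\Phi$, $\Psi$, $L$, $P$ are constant along the boundary. We can find a differential equation for $P(t, t+T)$ in terms of $t$ since
\[ \frac{d}{dt} S(t, t+T) = \frac{\partial S(\tau, t+T)}{\partial \tau} \Big|_{\tau=t} + \frac{\partial S(t, \sigma)}{\partial \sigma} \Big|_{\sigma=t+T}. \]
Hence
\[ \frac{d\Phi}{dt} = \Phi [N(t) P - A(t)] + [A(t+T) + L C_1(t+T)' C_1(t+T)] \Phi \tag{6.24} \]
\[ \frac{d\Psi}{dt} = [P N(t) - A'(t)] \Psi + \Psi [A'(t+T) + C_1(t+T)' C_1(t+T) L] \tag{6.25} \]
\[ \frac{dL}{dt} = \Phi N(t) \Psi + A(t+T) L + L A'(t+T) + L C_1(t+T)' C_1(t+T) L - N(t+T) \tag{6.26} \]
\[ \frac{dP}{dt} = -A'(t) P - P A(t) + P N(t) P - C_1(t)' C_1(t) + \Psi C_1(t+T)' C_1(t+T) \Phi \tag{6.27} \]
where $\Phi = \Phi(t, t+T)$ etc. This gives a differential equation for $P(t, t+T) = X_\infty(t, t+T; 0)$. The boundary conditions for equations (6.24) to (6.27) are given by solving equations (6.20) to (6.23) backwards in time from $\tau = \sigma = T$ to $\tau = 0$, $\sigma = T$. This backwards solution has to be performed only once, when $t = 0$. We now make use of the following identity:
\[ X_\infty(\tau, \sigma; F(\sigma)) = P(\tau, \sigma) + \Psi(\tau, \sigma) F(\sigma) [I - L(\tau, \sigma) F(\sigma)]^{-1} \Phi(\tau, \sigma). \]
This can be verified in a straightforward manner. Then
\[ X_\infty(t, t+T; F(t+T)) = P + \Psi F(t+T) [I - L F(t+T)]^{-1} \Phi. \]
These equations are simpler than they appear, since $\Psi = \Phi'$.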
Equations (6.24)-(6.27) give us a quadratic differential equation in matrices of size $2n$, which is integrated forwards to give the controller.

6.6 Robust Performance and LMIs

We would now like to discuss the advantages of the moving horizon control synthesis technique presented in this chapter. It is a very interesting theoretical result as it stands that, using only exact knowledge of the plant parameters up to some finite time ahead, we can both stabilize and guarantee an induced norm bound over all time for a linear time varying system. It is not so clear, however, what the merits of this scheme are in a practical sense. The often cited advantage of predictive control, that it gives extra tuning parameters, is not so clear in the $H_\infty$ context. In particular, the $H_\infty$ performance index is almost always not really a performance index. The system $G$ does not correspond to the lumped physical plant, and disturbance attenuation is not the goal. The generalized $H_\infty$ suboptimal synthesis problem is of primary interest because many other problems can be recast into this form. The construction of a controller which is robust to a particular perturbation class, whilst at the same time offering good performance with respect to other, more easily describable engineering specifications, is performed via $H_\infty$ synthesis, but with many design considerations taken into account. These would include, for example, the construction of appropriate weights and scaling matrices. Not only this, but if after the loop has been shifted into the form of the generalized problem it is discovered that the problem is singular or close to singular, then often the design specifications themselves would need to be reconsidered. A good discussion of the techniques associated with this methodology can be found in, for example, Dahleh and Diaz-Bobillo [21]. This $H_\infty$ design methodology, when transferred to the moving horizon context, brings problems.
Currently, use of the 'tuning parameters', the horizon length $T$ and the matrices $F$ and $G$, as extra design 'knobs' does not look hopeful, when the generalized problem is so far removed from the physical problem at hand. Not only this, but one of the key stated advantages was that future plant variation could be dealt with in the future, that is, that the control signal now did not explicitly depend on the time variation of the plant after time $t + T$. For this to be the case, it is necessary that the plant changes are such that redesign of the weights is not necessary, and that the problem does not become singular or poorly posed. However, we hope that these assumptions on the plant are not too stringent.

In the above discussion, we have been rather vague about an important point. This is that, although the controller does not depend explicitly on the system matrices after time $t + T$, we are required to make some choices of $F$ and $G$ in order to construct the controller. The choice of these matrices determines the allowed variation of the plant in the future. The following two Lemmas express the Riccati differential inequalities as linear matrix inequalities. First define

$$N_Q = \begin{bmatrix} I & 0 \\ -D_{21}' C_z & D_{21}^{\perp\prime} \end{bmatrix}, \qquad N_R = \begin{bmatrix} I & 0 \\ -D_{12} B_2' & D_{12}^{\perp} \end{bmatrix}.$$

Lemma 6.29. There exist constants $\varepsilon, \delta > 0$ and a bounded matrix function $X > \delta I$ such that

$$\dot X + (A - B_2 D_{12}' C_1)' X + X (A - B_2 D_{12}' C_1) - X (B_2 B_2' - \gamma^{-2} B_1 B_1') X + C_1' D_{12}^{\perp} D_{12}^{\perp\prime} C_1 < -\varepsilon I \tag{6.28}$$

if and only if there exist constants $\varepsilon, \delta > 0$ and a bounded matrix function $R > \delta I$ such that

$$\begin{bmatrix} N_R & 0 \\ 0 & I \end{bmatrix}' \begin{bmatrix} -\dot R + A R + R A' + \varepsilon I & R C_1' & B_1 \\ C_1 R & -I & 0 \\ B_1' & 0 & -\gamma^2 I \end{bmatrix} \begin{bmatrix} N_R & 0 \\ 0 & I \end{bmatrix} < 0. \tag{6.29}$$

In this case, one possible choice is $R = X^{-1}$.

Proof. First, it is easy to see that (6.29) is satisfied if and only if

$$\begin{bmatrix} -\dot R + (A - B_2 D_{12}' C_1) R + R (A - B_2 D_{12}' C_1)' - B_2 B_2' + \varepsilon I & R C_1' D_{12}^{\perp} & B_1 \\ D_{12}^{\perp\prime} C_1 R & -I & 0 \\ B_1' & 0 & -\gamma^2 I \end{bmatrix} < 0.$$

Successively applying the Schur complement formula gives the desired result.

Lemma 6.30.
There exist constants $\varepsilon, \delta > 0$ and a bounded matrix function $Z > \delta I$ such that

$$-\dot Z + A_z Z + Z A_z' - Z (C_z' C_z - \gamma^{-2} C_k' C_k) Z + B_1 D_{21}^{\perp\prime} D_{21}^{\perp} B_1' < -\varepsilon I$$

if and only if there exist constants $\varepsilon, \delta > 0$ and a bounded matrix function $Q > \delta I$ such that

$$\begin{bmatrix} N_Q & 0 \\ 0 & I \end{bmatrix}' \begin{bmatrix} \dot Q + \tilde A' Q + Q \tilde A + \varepsilon I & Q B_1 & C_k' \\ B_1' Q & -I & 0 \\ C_k & 0 & -\gamma^2 I \end{bmatrix} \begin{bmatrix} N_Q & 0 \\ 0 & I \end{bmatrix} < 0 \tag{6.30}$$

where $\tilde A = A + \gamma^{-2} B_1 B_1' X$. In this case, one possible choice is $Q = Z^{-1}$.

Remark 6.31. Note that, in both the linear matrix inequalities written above, it is possible to remove the $\varepsilon I$ term from the $(1,1)$ block of the matrix, and instead solve the LMI uniformly over time.

Written in this form, the inequalities offer another point of view. They show an interesting property of the moving horizon strategy, which might be regarded as robust performance. Replace $X$ and $Z$ in the above LMIs by $F$ and $G$. Then, for any plant whose system matrices satisfy the inequalities (6.29) and (6.30), the controller defined by $F$ and $G$ will result in a closed loop induced norm less than or equal to $\gamma$. Note that inequalities (6.29) and (6.30) are affine, and hence convex, in $A$ and $C_1$, but not in $B_1$ for example.

However, it is worth noting that, as described above, this robust performance condition is not very useful. Typically in robust control, the set of plants for which one desires to achieve a given performance level is specified in advance. Then the goal of the robust control strategy is, if a strategy which will achieve the performance level for all plants in the set exists, to give a construction for the controller. The important point, however, is that the robust control strategy must also tell the designer if there does not exist any strategy which will achieve the desired performance level. We might also like the design technique to indicate where the trade-offs should be made, etc. If we first choose $F$ and $G$, and then discover which set of plants around the nominal we can give robust performance for, then we have not fulfilled this goal.
An example of this is the idea that all linear time invariant strictly stabilizing controllers are robustly stabilizing, in the sense that there exists some ball of plants around the nominal which is also strictly stabilized. However, the question is not whether some ball exists, but whether the particular ball we need is stabilized.

However, this is compensated for in the moving horizon approach by the fact that we do not need to choose $F$ and $G$ in advance. Instead, at time $t$ we can choose $F$ and $G$ according to our knowledge of the system matrices. Therefore, it is possible to have the controller vary according to variations in the plant, rather than design in advance for all possible future variations in the plant.

There is an important difference between this moving horizon approach and the parameter varying approaches of [55, 79]. The parameter varying approach is to consider the system matrices $A$, $B$, $C$, $D$ as functions of a time varying parameter $\rho(t)$. Then $\dot X$ can be replaced by $\frac{\partial X}{\partial \rho} \dot\rho$. Then, since the LMIs are convex in $\rho$, we can attempt to find a solution $X(\rho)$ for all parameter values with $|\dot\rho| < \nu$, for some constant $\nu$. This is usually approached by gridding the parameter space. In this case, the controller achieves an induced 2-norm of $\gamma$ for all trajectories of the parameter which satisfy the rate limitation. The controller may either be independent of the parameter as in [55] or, less conservatively, may depend on the parameter and its rate of change as in [79]. Note that the parameter varying approach gives only sufficient conditions for existence of a controller.

However, in the moving horizon approach, we do not synthesize for all possible trajectories of a parameter, or equivalently for all possible variations of the system matrices within some set. Instead we use the parameter value in real time to construct a controller for the system.
The key difference is that we do not need existence of a parameter dependent $X$ which satisfies the LMI for all parameter variations, but instead simply need to satisfy it for the particular trajectory on which the system is moving. Again, this appears to have moved us away from the robust control paradigm, since we no longer give some specification in advance of the allowable parameter space.

We have so far constructed solutions $X_\infty$ and $Z_\infty$ by integrating over a finite horizon, and choosing $F$ and $G$ appropriately. The finite horizon integrations correspond to well defined moving horizon control and estimation problems. However, we could have chosen the intervals differently, for example so that they overlapped. In the current framework, since the solution to the filtering equation $Z_\infty$ depends on the solution $X_\infty$, we have to integrate the control and filtering equations over separate intervals, so that $Z_\infty$ is integrated over $[t - T_2, t]$ and $X_\infty$ is integrated over $[t, t+T]$.

The LMI approach to controller synthesis produces results in a different form. It results in linear matrix inequalities for $R = X_\infty^{-1}$ and $S = Y_\infty^{-1}$, together with a coupling condition $X_\infty - \gamma^2 Y_\infty^{-1} < 0$. These inequalities are independent, so, using similar methods to those in this chapter, we could show that integrating the equations over a finite interval with an appropriate boundary condition will give a solution to the inequality on an infinite interval. We would, of course, still have to satisfy the coupling condition. This makes it possible for us to choose the finite horizon control and filtering intervals so that they overlap. That is, we can choose to integrate both control and filtering equations either forwards or backwards in time. Unfortunately, the controller formulas in terms of $X_\infty$ and $Y_\infty$ also depend on $\dot X_\infty$, see for example Wu et al. [79].
Since we are looking for solutions to the Riccati differential equations with no specified boundary conditions, one possible solution might be to pick an initial boundary condition and integrate forwards. However, the problem is how to choose the initial value. We would like to be able to choose a boundary condition which maximizes the set of plants for which our controller is $\gamma$-feasible. Consider the Riccati differential inequality for $X$

$$\dot X + (A - B_2 D_{12}' C_1)' X + X (A - B_2 D_{12}' C_1) - X (B_2 B_2' - \gamma^{-2} B_1 B_1') X + C_1' D_{12}^{\perp} D_{12}^{\perp\prime} C_1 < -\varepsilon I.$$

Since this is an inequality, it is clear that given any $\delta > 0$, we can ensure that $X < \delta I$ for all $t > 0$ by choosing $\dot X$ sufficiently negative. However, in order to use this as the quadratic term of a Lyapunov function to guarantee closed loop stability, we need to ensure that $X > 0$ for all $t \in \mathbb{R}_+$. In fact, Proposition 3.19 shows that if $X$ is a solution to the Riccati differential equation that, integrating forwards, becomes negative definite at some time $t_1$, then for all $t > t_1$, $X(t) \not\geq 0$. We would like to have a result of the following form: if there exists a matrix function $X_1$ satisfying the above Riccati differential inequality, then for any initial condition $Q > X_1(0)$, we can also find a solution to the Riccati differential inequality, by suitably choosing the degree to which the inequality is satisfied. Unfortunately, it is not clear whether this is true or false.

6.7 Summary

In this chapter we have described a generalization of the linear quadratic moving horizon control formulation to the $H_\infty$ problem. We have shown that, for both state and output feedback problems, with a suitable choice of initial and final conditions it is possible to construct controllers which are both stabilizing and give an induced norm bound over all time. This form of controller uses only the explicit description of the system up to some finite time ahead, and from some finite time in the past.
We have shown that for linear time invariant systems the resulting controller is linear time invariant, and that the boundary conditions required for norm boundedness correspond to monotonicity conditions on the solutions of the Riccati equations. We have also given a general recursive technique for the computation of the solution to the Riccati partial differential equations required.

7. DISCRETE INTERVAL MOVING HORIZON CONTROL

7.
Publication date: 1995